(54) Title: METHOD FOR SCREENING RESTRICTION ENDONUCLEASES 
(57) Abstract 

A method is provided for identifying a restriction endonuclease, which includes the steps of (a) screening a target DNA sequence 
for the presence of known methylase sequence motifs, (b) identifying any open reading frames which lie close to the methylase sequence 
motifs screened in step (a), and (c) assaying the protein products of these open reading frames for restriction endonuclease activity. 
Methods for identifying isoschizomers of known restriction endonucleases, which isoschizomers possess a desired physical property, such 
as thermostability, are also provided by the present invention, as are several novel restriction endonucleases isolated from M. jannaschii, 
MjaIII and MjaIV. Additionally, a gene was identified that encoded a previously observed endonuclease activity, designated MjaII. Also 
provided by the present invention are vectors suitable for cloning a DNA sequence encoding a cytotoxic protein, via independent transcription 
promoters which may be selectively controlled by several conditions. A method for producing these cytotoxic proteins using such vectors 
is also provided, as are stable clones of PacI and NlaIII. 

METHOD FOR SCREENING RESTRICTION ENDONUCLEASES 

BACKGROUND OF THE INVENTION 

The present invention relates to a novel method for 
screening and identifying restriction endonucleases based on 
the proximity of their genes to the genes of their cognate 
methylases. A similar method for identifying isoschizomers of 
known endonucleases, which isoschizomers possess a desired 
physical property is also provided. Related methods for 
producing and cloning such endonucleases or other cytotoxic 
proteins are provided, as are several novel M. jannaschii 
restriction endonucleases. 

Nucleases are a class of enzymes which degrade or cut 
single- or double-stranded DNA. Restriction endonucleases are 
an important class of nucleases which recognize and bind to 
particular sequences of nucleotides (the 'recognition 
sequence') along the DNA molecule. Once bound, they cleave 
both strands of the molecule within, or to one side of, the 
recognition sequence. Different restriction endonucleases 
recognize different recognition sequences. Over two hundred 
restriction endonucleases with unique specificities have been 
identified among the many hundreds of bacterial and archaeal 
species that have been examined to date. Some have also been 
found to be encoded by eukaryotic viruses. 



It is thought that in nature, restriction endonucleases, 
which comprise the first component of what are commonly 
referred to as restriction-modification ("RM") systems, play a 
protective role in the welfare of the host cell. They enable 
bacteria and archaea to resist infection by foreign DNA 
molecules like viruses and plasmids that would otherwise 
destroy or parasitize them. They impart resistance by cleaving 
invading foreign DNA molecules when the appropriate 
recognition sequence is present. The cleavage that takes place 
disables many of the infecting genes and renders the DNA 
susceptible to further degradation by non-specific 
endonucleases. 

A second component of these bacterial and archaeal 
protective systems are the modification methylases. These 
enzymes are complementary to the restriction endonucleases 
and they provide the means by which bacteria and archaea are 
able to protect their own DNA from cleavage and distinguish it 
from foreign, infecting DNA. Usually, modification methylases 
recognize and bind to the same nucleotide recognition sequence 
as the corresponding restriction endonuclease, but instead of 
cleaving the DNA, they chemically modify one or other of the 
nucleotides within the sequence by the addition of a methyl 
group. Following methylation, the recognition sequence is no 
longer bound or cleaved by the restriction endonuclease. The 
DNA of the host cell is always fully modified by virtue of the 
activity of the modification methylase. It is therefore 
completely insensitive to the presence of the endogenous 
restriction endonuclease. It is only unmodified, and therefore 
identifiably foreign DNA, that is sensitive to restriction 
endonuclease recognition and cleavage. 

There are three kinds of restriction systems. The Type I 
systems are complex. They recognize specific sequences, but 
cleave randomly with respect to that sequence (Bickle, T.A., 
Nucleases [eds. Linn, S.M., Lloyd, S.L., and Roberts, R.J.], Cold 
Spring Harbor Laboratory Press, pp. 89-109, (1993)). The Type 
III enzymes, of which only five have been characterized 
biochemically, recognize specific sequences, cleave at a 
precise point away from that sequence, but rarely give 
complete digestion (ibid). Neither of these two kinds of 
systems is suitable for genetic engineering, which is the sole 
province of the Type II systems. The latter recognize a 
specific sequence and cleave precisely either within or very 
close to that sequence. They typically require only Mg++ for 
their action. 

The traditional approach to screening for restriction 
endonucleases, pioneered by Roberts et al. and others in the 
early to mid 1970s (e.g. Smith, H.O. and Wilcox, K.W., J. Mol. 
Biol. 51:379-391 (1970); Kelly, T.J. Jr. and Smith, H.O., J. Mol. 
Biol. 51:393-409, (1970); Middleton, J.H. et al., J. Virol. 10:42- 
50 (1972); and Roberts, R.J. et al., J. Mol. Biol. 91:121-123, 
(1975)), was to grow small cultures of individual strains, 
prepare cell extracts and then test the crude cell extracts for 
their ability to produce specific fragments on small DNA 
molecules (see Schildkraut, I.S., "Screening for and 
Characterizing Restriction Endonucleases", in Genetic 
Engineering, Principles and Methods, Vol. 6, pp. 117-140, 
Plenum Press (1984)). Using this approach, about 12,000 
strains have been screened worldwide to yield the current 
harvest of almost 3,000 restriction endonucleases (Roberts, 
R.J. and Macelis, D., Nucl. Acids. Res. 26:338-350 (1998)). 
Roughly, one in four of all strains examined, using a 
biochemical approach, shows the presence of a Type II 
restriction enzyme. 

Beginning in 1978, investigators in a number of 
laboratories set about to clone the genes for some of the Type 
II restriction systems (Szomolanyi, I. et al., Gene 10:219-225 
(1980)). This promised to be quite a successful enterprise 
because of the ease of selecting for methylase genes (Mann, 
M.B. et al., Gene 3:97-112 (1978); Kiss, A.M. et al., Nucl. Acids. 
Res. 13:6403-6420 (1985)). Basically, if an organism is known 
to contain a restriction system, then a shotgun of the 
organism's DNA can be made and the resulting mixed population 
of plasmids can be grown as a single, mixed culture. This 
mixed population of plasmid DNAs is then isolated and cleaved in 
vitro with the restriction enzyme. Only those plasmids 
that have both received and expressed the corresponding 
methylase gene will survive the digestion. Upon 
retransformation, any cells that grow are greatly enriched for 
the presence of the methylase gene. Because the methylase 
and restriction enzyme genes are usually adjacent, this method 
can yield both genes. Sometimes a single round of selection is 
sufficient, but routinely two rounds of selection yield the 
required methylase gene with high efficiency. Only when 
expression of the methylase gene is poor or coexpression of 
flanking sequences is lethal does the selection fail. Various 
tricks and alternative cloning methods have been developed to 
overcome such limitations (e.g. Brooks, J.E. et al., Nucl. Acids. 
Res. 17:979-997 (1989); Wilson, G.G. and Meda, M.M., U.S. Patent 
5,179,015 (1993)). 

As the skilled artisan will appreciate, restriction 
endonucleases are cytotoxic products. In general, genes 
encoding cytotoxic products are extremely difficult to clone, 
even when care has been taken to remove sequences that might 
enable their expression in the plasmid host. Generation of 
their mRNA can be due to 'read-through' transcription that 
originates at some point on the plasmid other than the toxic 
locus. Absent an identifiable Shine-Dalgarno (SD) consensus 
sequence upstream of an initiator codon, translation of the 
toxic protein may be initiated by a cryptic ribosome binding 
site (RBS) (by definition, not fitting the SD consensus, and 
usually non-obvious), or abortive termination of an upstream 
ribosome-mRNA complex. Long mRNA concatamers can be 
generated from plasmid templates via 'rolling circle 
transcription'. This may increase and/or stabilize the mRNA 
of the toxic allele, so that even rare translational initiation 
events can generate enough protein to impact cell viability 
negatively. 

Attempting to clone a toxic gene into a plasmid designed 
to facilitate high expression is, in many cases, futile. 
Transcriptional repressors are often employed to down- 
regulate expression, and typically act by interfering with 
productive transcription. This type of regulation is dependent 
upon: 1) the molar ratio of repressor protein to its cognate 
binding site (operator), and 2) the affinity of the repressor 
protein for the operator sequence. In no case is it reasonable 
to expect 100% of the operator sites to be occupied 100% of 
the time. Thus, some expression of a cloned gene is 
unavoidable, creating a powerful selective pressure against 
cells that faithfully replicate the lethal gene. Those cells in 
which expression of the toxic gene has been mutagenically 
inactivated survive. 

Genes encoding cytotoxic products must be actively and 
constitutively down-regulated, and any adventitious 
expression eliminated at both the transcriptional and 
translational levels. 
This may be accomplished through the action of 
antisense RNAs (asRNA). The asRNA base pairs with a segment 
of mRNA and presumably inhibits translational initiation or 
elongation. The use of opposing promoters to modulate 
expression of a gene encoding a potentially toxic protein has 
been reported (O'Connor and Timmis, J. Bacteriol. 
169(10):4457-4462 (1987)). Their system employed the 
endogenous E. coli RNA polymerase ("RNAP"), with the sense 
RNA (sRNA) generated from the λ-derived PL promoter, and 
asRNA initiating at the E. coli Plac promoter. Operator 
sequences for repressor proteins normally associated with 
these promoters, namely cI(857) and LacI, were also present 
on the high copy plasmid (pUC8/18) backbone. A second copy 
of the LacI operator was inserted between PL and the gene of 
interest. The alleles encoding the cI857 and LacI repressor 
proteins were not part of the plasmid, but were provided 
either from the chromosome (cI857 λ prophage) or on the low 
copy plasmid pACYC184 (lacI). 

This approach to cloning a cytotoxic gene, however, 
suffers from several shortcomings: 

1) a high copy replicon significantly raises the 
dosage of the toxic allele, increasing the likelihood for 
undesired expression; 

2) placement of operator sequences on a high copy 
replicon, while the genes encoding the repressor proteins are 
present at substantially lower copy number, does not provide 
optimal repression; 

3) strong repression of gene expression and elective 
induction of gene expression are mutually exclusive. 

While the idea of using opposing promoters to modulate 
gene expression has been previously demonstrated (Elledge and 
Davis, Genes & Develop. 3:185-197 (1988)), it has not been 
demonstrated as a successful method using a toxic gene. The 
Elledge, et al. system relies upon conditional expression of a 
gene encoding spectinomycin resistance. This approach proved 
to be a useful genetic selection for genes encoding proteins 
capable of exhibiting transcriptional repressor-like activity 
(Elledge et al., PNAS USA 86:3689-3693 (1989); Dorner and 
Schildkraut, Nucl. Acid. Res. 22(6):1068-1074 (1994)). These 
studies showed that transcriptional inactivation of a gene can 
be achieved with an antisense promoter. 

It is imperative that stable clones of desired loci 
(including those encoding cytotoxic products) be established 
in the context of an inducible expression system, such as an E. 
coli expression system, for the following reasons: 

a) to generate a physical archive of single genes 
encoding potentially novel biochemical activities (as opposed 
to phage or cosmid constructs containing many genes); 

b) to allow for rapid and facile characterization 
and/or manipulation of the entire allele; 

c) and to move rapidly from discovery to production. 

It would therefore be desirable to develop a method for 
cloning genes encoding cytotoxic products, including 
restriction endonucleases, or other genes which cannot be 
stably cloned by traditional methods, in order to enable the 
generation of the above-mentioned archive. 

Nonetheless, as a result of current cloning methods, more 
than 100 systems have been cloned and many have been 
sequenced (Wilson, G.G., Nucl. Acids. Res. 19:2539-2566 
(1991)). Several conclusions have emerged. First, genes for 
restriction endonucleases that recognize unique sequences are 
usually different from one another and their sequences are 
unique within GenBank. Typically, the only time when 
similarity has been found between restriction enzyme gene 
sequences is when the two enzymes are isoschizomers or have 
closely related recognition sequences; i.e. they recognize 
exactly the same sequence, but come from different 
microorganisms (e.g. Lubys, A. et al., Gene 141:85-89 (1994); 
Withers, B.E. et al., Nucl. Acids. Res. 20:6267-6273 (1992)). 
Second, among methylase gene sequences there is very strong 
similarity between enzymes that form 5-methylcytosine 
(m5C), such that they can readily be recognized by pattern 
matching algorithms (Posfai, J. et al., Nucleic Acids. Res. 
17:2421-2435 (1989); Lauster, R. et al., J. Mol. Biol. 206:305- 
312 (1989)). The genes for methylases that form N6- 
methyladenine (N6A) or N4-methylcytosine (N4C) are also 
related to one another, but show fewer well-conserved 
similarities. At least three subfamilies of sequences can be 
recognized (Wilson, G.G., Meth. Enzymol. 216:259-279 (1992), 
Timinskas et al. Gene 157: 3-11 (1995)). In this case, pattern 
matching algorithms do fairly well, but cannot provide 
conclusive evidence whether a newly sequenced gene encodes 
an N6A or an N4C methyltransferase. Third, and most 
significant, for virtually all known RM systems that have so 
far been cloned, the methylase gene and the restriction enzyme 
gene lie either adjacent or extremely close to one another 
(Wilson, G.G., Nucl. Acids. Res. 19:2539-2566 (1991)). 

Within the last year, sequences have become available 
for many complete bacterial and archaeal genomes, including: 
Haemophilus influenzae (Fleischmann, R.D. et al., Science 
269:496-512 (1995)), Mycoplasma genitalium (Fraser, C.M. et al., 
Science 270:397-403 (1995)), Methanococcus jannaschii (Bult, 
C.J. et al., Science 273:1058-1073 (1996)), Mycoplasma 
pneumoniae (Himmelreich, R. et al., Nucl. Acids. Res. 24:4420- 
4449 (1996)) and Synechocystis species (Kaneko, T. et al., DNA 
Res. 3:109-136 (1996)). H. influenzae and M. jannaschii were 
each known to encode two Type II RM systems (Roberts, R.J. and 
Macelis, D.M., supra (1998)). The complete sequences of their 
genomes have revealed a remarkable fact. In each case, these 
genomes appear to contain multiple RM systems, many of which 
have never been detected biochemically. The results of 
computer analysis of these sequences are compared with the 
biochemical results in Table 1: 



Table 1 

Organism                 RM Systems Detected    RM Systems Detected 
                         by Computer            Biochemically 

H. influenzae            8                      2 
M. genitalium            2                      not tested 
M. jannaschii            12                     2 
M. pneumoniae            4                      not tested 
Synechocystis species    4                      not tested 


As mentioned earlier, among Type II restriction enzymes 
there are now more than two hundred different specificities 
present. Table 2 shows the kind of sequence patterns that are 
currently known to be recognized by restriction endonucleases. 
It lists the number of specific examples of each presently in 
the database, compared with the theoretical number based on 
all possible sequence combinations. 

In column 1 of this table, the pattern representation, n', 
signifies the complement of n. Thus nnn'n' in the first entry is 
used to represent the 16 possible tetranucleotide palindromes 
AATT, ACGT, AGCT etc. 
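
The pattern notation can be made concrete with a short sketch. It assumes, as a reading of the "Possible" column of Table 2 rather than a rule stated in the text, that each unprimed lowercase n is a freely chosen specific base, while primed positions (n'), N, and degenerate letters such as R or Y add no further choices. The function names are illustrative only.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def free_positions(pattern):
    """Count unprimed lowercase n's in a pattern such as "nnNn'n'"."""
    count = 0
    for i, ch in enumerate(pattern):
        primed = i + 1 < len(pattern) and pattern[i + 1] == "'"
        if ch == "n" and not primed:
            count += 1
    return count


def possible_sequences(pattern):
    """Theoretical number of distinct recognition sequences for a pattern."""
    return 4 ** free_positions(pattern)


def palindromes(half_length):
    """Enumerate palindromes of the form n..n n'..n' (e.g. nnn'n')."""
    result = []
    for left in product("ACGT", repeat=half_length):
        # The right half is the reverse complement of the left half.
        right = "".join(COMPLEMENT[base] for base in reversed(left))
        result.append("".join(left) + right)
    return result


if __name__ == "__main__":
    print(possible_sequences("nnn'n'"))   # tetranucleotide palindromes
    print(palindromes(2)[:3])             # first few members of that set
```

Under this reading, nnn'n' has two free positions and therefore 4 x 4 = 16 members, which agrees with the count of tetranucleotide palindromes given above.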

It is clear that for some types of patterns, such as the 
simple hexanucleotide and tetranucleotide palindromes, we are 
very close to having all possible such enzymes. However, for 
many of the other patterns we are a long way away from the 
theoretically possible number. This suggests that there are 
many more specificities waiting to be discovered. 

Accordingly, it would be desirable to provide an 
alternative method for screening for restriction endonucleases 
which would overcome the limitations associated with the 
traditional biochemical methods described above. Such an 
alternative method would facilitate the identification, 
characterization, and cloning of heretofore unknown 
restriction endonucleases as well as isoschizomers of known 
restriction endonucleases. 

Table 2 

Sequence patterns recognized by Type II restriction enzymes 

                      Specific Example 
Pattern               Rec. Sequence     Enzyme    Observed  Possible 

nnn'n'                AATT              TspEI     14        16 
nnnn'n'n'             AACGTT            AclI      55        64 
nnnnn'n'n'n'          ATTTAAAT          SwaI      9         256 
nnnnn                 ACGGC             BcefI     18        1024 
nnnnnn                ACCTGC            BspMI     25        4096 
nnNn'n'               ACNGT             Tsp4CI    7         16 
nDnn'Hn'              GDGCHC            SduI      1         16 
nKnnnn                GKGCCC            BmgI      1         1024 
nMnn'Kn'              CMGCKG            NspBII    1         16 
nnBNNNNNVn'n'         GABNNNNNVTC       Hin4I     1         16 
nnMKn'n'              GTMKAC            AccI      1         16 
nnnn                  CCGC              AciI      2         256 
nnNNn'n'              CCNNGG            SecI      3         16 
nnnNn'n'n'            CCTNAGG           SauI      3         64 
nnnNnnn               CACCTGC           UbaEI     3         4096 
nnnNNNn'n'n'          CACNNNGTG         DraIII    3         64 
nnnNNNNn'n'n'         GAANNNNTTC        XmnI      3         64 
nnnNNNNNn'n'n'        CCANNNNNTGG       PflMI     6         64 
nnNNNNNNNn'n'                           BsiYI     2         16 
nnnNNNNNNn'n'n'       ACCNNNNNNGGT      HgiEII    3         64 
nnnnNNNNNn'n'n'n'     GGCCNNNNNGGCC     SfiI      1         256 
nnNNNNNnnnn           ACNNNNNCTCC       BsaXI     2         4096 
nnnNNNNNNNnn                            BcgI      3         1024 
nnnnNNNNNNnnn         GAACNNNNNNTCC     UbaDI     1         16384 
nnnNNNNNNNNNn'n'n'    CCANNNNNNNNNTGG   XcmI      1         64 
nnNNNNnnnYn           ACNNNNGTAYC       BaeI      1         4096 
nnnRnn                CAARCA                      2         1024 
nnnWn'n'n'            ACCWGGT           SexAI     4         64 
nnRYn'n'              ACRYGT            AflIII    4         16 
nnSn'n'               CCSGG             CauII     3         16 
nnWn'n'               CCWGG             EcoRII    4         16 
nnWWn'n'              CCWWGG            StyI      1         16 
nnYNNNNRn'n'          CAYNNNNRTG        MslI      1         16 
nnYRn'n'              CTYRAG            SmlI      3         16 
nRnn'Yn'              GRCGYC            AcyI      2         16 
nRnnn'n'Yn'           CRCCGGYG          SgrAI     1         64 
nWnn'Wn'              GWGCWC            HgiAI     1         16 
nYnn'Rn'              CYCGRG            AvaI      1         16 
Rnn'Y                 RGCY              CviJI     1         4 
Rnnn'n'Y              RAATTY            ApoI      5         16 
RnnNn'n'Y             RGGNCCY           DraII     1         16 
RnnWn'n'Y             RGGWCCY           PpuMI     1         16 
Wnnn'n'W              WCCGGW            BetI      3         16 
Ynnn'n'R              YACGTR            BsaAI     2         16 
Ynnnnn                CGGCCR            GdiII     1         1024 

SUMMARY OF THE INVENTION 

In accordance with one embodiment of the present 
invention, a novel method for screening for restriction 
endonucleases is provided. This method has been successfully 
employed and may be used to identify heretofore unknown 
restriction endonucleases as well as isoschizomers of known 
restriction endonucleases, such isoschizomers possessing a 
desired physical property, such as thermostability. This novel 
method will also facilitate the characterization, cloning and 
production of newly identified restriction endonucleases and 
isoschizomers. 

More specifically, in its broadest application the present 
invention comprises the following steps: 

(a) screening a target DNA sequence for the presence 
of known DNA methylase sequences and motifs 
characteristic of DNA methylases; 

(b) identifying open reading frames which lie close to 
the DNA methylase sequence of step (a); and 

(c) analyzing the protein product of the open reading 
frame of step (b) for endonuclease activity. 

Once a new restriction endonuclease or isoschizomer has 
been identified in accordance with the above-outlined 
methodology, the restriction endonuclease so identified may be 
produced in accordance with standard protein purification 
techniques or by recombinant DNA techniques. 

Several novel restriction endonucleases isolated from M. 
jannaschii using the methods of the present invention are also 
provided, including MjaII, which is a thermostable 
isoschizomer of Sau96I, MjaIII, which is a thermostable 
isoschizomer of MboI, and MjaIV, a new specificity recognizing 
GTNNAC. 

Also provided by the present invention is a novel method 
for stably cloning DNA sequences which might otherwise be 
unstable because the products encoded are toxic. One example 
provided is a stable, inducible clone encoding the normally 
toxic restriction endonuclease PacI in the absence of a 
protective methylase. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows the agarose gel electrophoresis of DNAs 
digested by the transcription/translation product of the 
MJ0984 open reading frame from M. jannaschii and BfaI 
(recognition sequence CTAG). Lane 1: BstNI/pBR322 markers; 
Lane 2: bacteriophage λ DNA digested with BfaI; Lane 3: double 
digest of bacteriophage λ DNA with BfaI and the 
transcription/translation product from MJ0984; Lane 4: 
bacteriophage λ DNA digested with the 
transcription/translation product from MJ0984; Lane 5: 
HindIII/bacteriophage λ DNA markers. 

Figure 2 is the agarose gel electrophoresis of R.SfiI 
activity in coupled transcription/translation reactions. SfiI 
digests of Adenovirus-2 DNA (35,937 bp) were carried out as 
described in the text. Lane 1: Uncut DNA. Lane 2: DNA digested 
with 10 units purified SfiI (NEB). Lanes 3-7: DNA digested with 
serially diluted reaction supernatant of in vitro 
transcription/translation reaction without added T7 RNA 
polymerase. Lanes 8-12: DNA digested with serially diluted 
reaction supernatant of in vitro transcription/translation 
reaction with added T7 RNA polymerase. Lanes 3 & 8: 3 µl 
reaction supernatant. Lanes 4 & 9: 1 µl reaction supernatant. 
Lanes 5 & 10: 0.3 µl reaction supernatant. Lanes 6 & 11: 0.1 µl 
reaction supernatant. Lanes 7 & 12: 0.03 µl reaction 
supernatant. The expected sizes of products from a complete 
SfiI digestion are 16,284, 12,891, 5,739 and 1,023 bp. 

Figure 3 is a diagram depicting the vector pLT7K used 
in the stable cloning of genes encoding cytotoxic proteins of 
Examples IX and X. 

DETAILED DESCRIPTION OF THE INVENTION 

In accordance with one preferred embodiment of the 
present invention, there is provided a novel method for 
identifying a restriction endonuclease. The first step of this 
method is to compile a database of DNA sequences that encode 
either a DNA methylase or a restriction enzyme. This can be 
accomplished by searching GenBank for coding sequences that 
carry the annotation "methylase", "methyltransferase", 
"modification methylase", "restriction endonuclease" or 
"restriction enzyme". All such sequences are collected and 
used as the master database of restriction enzyme and 
methylase gene sequences, the "RM sequence database". If 
desired, and if available, then other DNA sequences known to 
encode DNA methylases or restriction endonucleases, not 
present in GenBank, can be included in this master collection. 
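
By way of illustration only, the compilation step might be sketched as a keyword filter over annotated coding-sequence records. The record layout (dictionaries with "id", "annotation" and "sequence" keys) and the function names are assumptions made for this sketch; only the five annotation terms come from the text above.

```python
# Annotation terms named in the text; matching is case-insensitive.
RM_KEYWORDS = (
    "methylase",
    "methyltransferase",
    "modification methylase",
    "restriction endonuclease",
    "restriction enzyme",
)


def is_rm_annotation(annotation):
    """True if the annotation mentions any of the RM-related terms."""
    text = annotation.lower()
    return any(keyword in text for keyword in RM_KEYWORDS)


def build_rm_database(records, extra_records=()):
    """Collect RM-annotated records, then append sequences from other sources.

    records       : iterable of dicts with "id", "annotation", "sequence"
                    (a hypothetical layout, e.g. parsed from a GenBank dump)
    extra_records : known RM sequences not present in the searched database
    """
    database = [r for r in records if is_rm_annotation(r["annotation"])]
    database.extend(extra_records)
    return database


if __name__ == "__main__":
    demo = [
        {"id": "cds1", "annotation": "Modification methylase", "sequence": "ATG"},
        {"id": "cds2", "annotation": "DNA gyrase subunit A", "sequence": "ATG"},
        {"id": "cds3", "annotation": "type II restriction enzyme", "sequence": "ATG"},
    ]
    print([r["id"] for r in build_rm_database(demo)])
```

Plain substring matching is deliberately crude and will also collect unrelated annotations that happen to contain one of the terms (for example "demethylase"); such entries are expected to be weeded out by the motif checks applied in the second step.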

The second step is to take the new target sequence, say 
that of a bacterial genome, and compare each open reading 
frame present in that sequence against the RM sequence 
database. Preferably, this is accomplished using the program 
BLAST (Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and 
Lipman, D.J., J. Mol. Biol. 215: 403-410 (1990)) or other 
comparable searching routines, such as FASTA (Pearson, W. and 
Lipman, D. Proc. Natl. Acad. Sci. USA 85: 2444-2448 (1988)). 
Each time that a significant match is found between an open 
reading frame in the target sequence and a known gene in the 
RM sequence database, it is examined more carefully. If the 
match is to a restriction endonuclease gene, then that open 
reading frame is likely to encode a restriction enzyme and it 
can be investigated biochemically as detailed below. Where 
the matches are to DNA methylase genes the matches are 
examined to see if the short sequence motifs characteristic of 
cytosine-5 methylases (Posfai, J. et al., Nucleic Acids. Res. 
17:2421-2435 (1989); Lauster, R. et al., J. Mol. Biol. 206:305- 
312 (1989)) or those characteristic of N4C- or N6A- 
methylases (Wilson, G.G., Meth. Enzymol. 216:259-279 (1992), 
Timinskas et al. Gene 157: 3-11 (1995)) are present. If they 
are, then it is concluded that the new open reading frame in the 
target sequence is likely to encode a DNA methylase. Because 
DNA methylases and their cognate restriction endonucleases 
have usually been found to be encoded close to one another 
(Wilson, G.G., Nucl. Acids. Res. 19:2539-2566 (1991)), it is of 
particular interest to examine the open reading frames that 
flank this methylase gene to see if they can be considered new 
restriction enzyme gene candidates. 

The open reading frames that flank the newly identified 
methylase gene are preferably first checked to see if they have 
homologs in the RM sequence database. If one shows even 
weak similarity to a known restriction enzyme gene, then it is 
considered to be a prime candidate to encode a new restriction 
endonuclease of the same specificity and it can be 
characterized biochemically as described below. If the 
flanking sequences show no similarity to any sequence in the 
RM sequence database, then they are compared with the entire 
GenBank database to see if a match can be detected to some 
other sequence. Again BLAST can be used for this purpose. If 
they show a match to some gene of known function that is not 
a methylase or a restriction enzyme, then they can be 
eliminated as prime candidates for the restriction enzyme 
gene, although they cannot be rigorously excluded in the absence 
of direct biochemical evidence. If both flanking genes have 
good matches in GenBank, then the original methylase gene is 
considered to be an orphan methylase (i.e., a methylase which 
is not associated with a cognate restriction endonuclease) that 
does not form part of a restriction/modification system. In 
some instances, however, (see, Example VI), the restriction 
endonuclease gene may be separated from its cognate 
methylase by an intervening ORF, thus necessitating analysis 
of ORFs upstream and downstream from the ORF flanking the 
methylase gene. In this situation, adjacent ORFs greater than 
about 100 amino acids (approximately 300 nucleotides) should 
be examined. If this does not yield any candidate genes, the 
examination should continue upstream and downstream to the 
next ORF of greater than about 100 amino acids. This process 
should continue up to about 3 kb on either side of the 
methylase gene. If either one or both flanking open reading 
frames are unique (i.e. have no homologs in GenBank) then they 
become candidates for new restriction enzyme genes. 
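
The positional part of this search can be sketched as follows. The sketch assumes ORF coordinates are already available as 1-based inclusive nucleotide positions, reads "up to about 3 kb" as a limit on the gap between an ORF and the methylase gene, and treats the size cutoff of about 100 amino acids as inclusive. The Orf class and function names are hypothetical, and the homology checks against the RM sequence database and GenBank are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length_nt(self):
        return self.end - self.start + 1


def gap_to(orf, methylase):
    """Nucleotides separating an ORF from the methylase gene (0 if they overlap)."""
    if orf.end < methylase.start:
        return methylase.start - orf.end - 1
    if orf.start > methylase.end:
        return orf.start - methylase.end - 1
    return 0


def flanking_candidates(orfs, methylase, min_aa=100, window_nt=3000):
    """ORFs of about min_aa codons or more lying within window_nt of the methylase.

    Results are ordered nearest first, mirroring the outward walk from the
    immediately flanking ORFs described in the text.
    """
    hits = [
        orf for orf in orfs
        if orf != methylase
        and orf.length_nt >= 3 * min_aa
        and gap_to(orf, methylase) <= window_nt
    ]
    return sorted(hits, key=lambda orf: gap_to(orf, methylase))


if __name__ == "__main__":
    m = Orf("methylase", 5000, 6200)
    orfs = [m, Orf("orfA", 4000, 4900), Orf("orfB", 6250, 6400),
            Orf("orfC", 6500, 7600), Orf("orfD", 10000, 11000)]
    print([o.name for o in flanking_candidates(orfs, m)])
```

Each ORF returned would then be compared against the RM sequence database and GenBank as described above before being treated as a candidate.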

Once an open reading frame has been identified that is a 
candidate for a restriction enzyme gene, a purely in vitro 
procedure is preferably used to prepare a small sample of the 
protein product of that open reading frame followed by testing 
of the protein product for restriction enzyme activity, again in 
vitro. In one preferred embodiment, whole genomic DNA from 
the microorganism is prepared, and two PCR primers are 
synthesized. One primer corresponds to a region that lies 
downstream (3') of the stop codon of the open reading frame, 
contains about 20 nucleotides complementary to the coding 
strand and an additional 10-15 nucleotides that contain a 
restriction enzyme recognition site not found in the gene 
itself, in case later cloning is required. This primer, which is 
typically 30-35 nucleotides long, is designed to copy the 
non-coding strand. 

The second primer is designed to produce the coding 
strand. This second primer contains, close to its 5' end, 
a restriction enzyme recognition site not found in the gene, 
followed by a promoter site for a polymerase such as T7 RNA 
polymerase, a ribosome binding site appropriate for the 
translation system being used in the later step, and positioned 
so that translation will begin with the first start codon of the 
open reading frame that is the candidate for the restriction 
enzyme gene. Typically, about 20 additional nucleotides are 
present at the 3' end of this primer that correspond to the 
first few codons of the open reading frame. 
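
The layout of the two primers can be illustrated with a minimal sketch. The T7 promoter string, the ribosome binding site with spacer, and the 5' padding used below are assumptions chosen for illustration and are not specified in the text; in practice the ribosome binding site would be matched to the translation system actually used. The function design_primers is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Illustrative values only; none of these sequences is given in the text.
T7_PROMOTER = "TAATACGACTCACTATAGGG"  # commonly cited T7 promoter consensus
EXAMPLE_RBS = "AAGGAGATATACAT"        # Shine-Dalgarno-like site plus spacer
PAD = "GTTGTT"                        # arbitrary 5' padding ahead of the site


def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(template, orf_start, orf_end, fwd_site, rev_site,
                   rbs=EXAMPLE_RBS, anneal=20):
    """Return (forward, reverse) primers, both written 5'->3'.

    template  : coding-strand sequence containing the ORF
    orf_start : 0-based index of the first base of the start codon
    orf_end   : 0-based index one past the last base of the stop codon
    fwd_site, rev_site : cloning sites (upper case) that must be absent
                         from the gene
    """
    template = template.upper()
    gene = template[orf_start:orf_end]
    for site in (fwd_site, rev_site):
        if site in gene or reverse_complement(site) in gene:
            raise ValueError(f"site {site} occurs inside the gene")

    downstream = template[orf_end:orf_end + anneal]
    if len(downstream) < anneal:
        raise ValueError("not enough sequence downstream of the stop codon")

    # Forward primer: site, promoter, RBS, then the first ~20 nt of the ORF,
    # so that translation starts at the first start codon.
    forward = (PAD + fwd_site + T7_PROMOTER + rbs
               + template[orf_start:orf_start + anneal])
    # Reverse primer: site-bearing tail, then ~20 nt complementary to the
    # coding strand just 3' of the stop codon.
    reverse = PAD + rev_site + reverse_complement(downstream)
    return forward, reverse
```

With a 6-nucleotide pad and a 6-nucleotide site, the reverse primer comes to 32 nucleotides, within the 30-35 nucleotide range mentioned above.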

These two primers are then used in a standard 
amplification procedure such as the polymerase chain reaction 
(PCR) so that a linear piece of DNA is produced, which contains 
a T7 promoter, a ribosome binding site, and the complete open 
reading frame that is the candidate for the restriction enzyme 
gene. This PCR product is used as a template for transcription 
in vitro by T7 RNA polymerase. This results in the production 
of a large amount of RNA containing the complete coding 
sequence for the candidate open reading frame. Either with or 
without further purification the RNA template produced is then 
used as a template for translation in vitro using a standard 
commercial translation system. 

One preferred method of assaying for the presence of 
endonuclease activity is in vitro transcription-translation 
using the rabbit reticulocyte system. Another preferred 
method of assaying for such endonuclease activity is the E. coli 
S-30 transcription-translation system. 

In accordance with the present invention, it has been 
found that a particularly preferred method for assaying for 
thermophilic endonuclease activity is the wheat germ based 
translational system. 

When assaying for endonuclease activity it is often 
necessary to incubate the translation product and substrate 
DNA at a temperature that mimics the normal living conditions 
of the organism from which the gene originated. When 
assaying a translation product of an ORF that was amplified 
from a thermophilic organism's genomic DNA the assay is 
usually incubated at temperatures ranging from 50°C to 80°C. 
It was found that at temperatures above 50°C the reticulocyte 
translational mix begins to congeal and endonuclease activity 
is hard to detect. Although thermophilic endonucleases have 
been identified using reticulocyte based translations, the 
wheat germ translation mix does not congeal when heated in 
the same way and hence is a more practical assay particularly 
for thermophilic endonucleases. 

Following translation, during which time a small amount 
of the protein product from the candidate open reading frame 
will have been produced, the entire translation mix is assayed 
for the presence of the restriction enzyme using well 
established techniques. (Schildkraut, "Screening for and 
Characterizing Restriction Endonucleases", in Genetic 
Engineering, Principles and Methods, Vol. 6, pp. 117-140, 
Plenum Press (1984)). This may be accomplished, for example, 
by taking a small portion of the translation mix and incubating 
it with several substrate DNAs such as those from 
bacteriophage λ, bacteriophage T7, Adenovirus-2, etc. that are 
likely to contain one or more recognition sites for the 
restriction enzyme. Typically, the assays are allowed to run 
from 30 minutes to 16 hours. The whole mix is then applied to 
an agarose gel where DNA fragments separate according to 
size. If a restriction enzyme is present in the translation mix, 
then usually that restriction enzyme will cleave one of the 
test substrate DNAs, leading to the banding pattern that is 
typical of restriction endonucleases. If bands are detected, 
then the specificity of the restriction enzyme can be 
determined using standard procedures. (Schildkraut, supra 
(1984)). 
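
As an aid to interpreting such gels, the fragment sizes expected from a candidate specificity can be predicted from the substrate sequence. The following Python sketch is illustrative only: it assumes a linear substrate, a palindromic recognition site (so that scanning one strand suffices), and a cut position supplied by the user; the function names and the toy sequence are hypothetical.

```python
# Sketch: predict the banding pattern of a linear substrate DNA digested by an
# enzyme with a given recognition site. Assumes a palindromic site, so only
# one strand needs to be scanned.

def cut_positions(seq, site, cut_offset=0):
    """Return 0-based cut coordinates; cut_offset is the number of bases
    of the site that lie to the left of the cut (e.g. 1 for C^TAG)."""
    seq, site = seq.upper(), site.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i + cut_offset)
        i = seq.find(site, i + 1)
    return positions


def fragment_sizes(seq, site, cut_offset=0):
    """Return fragment lengths (largest first), as they would be ordered
    from the top of an agarose gel downwards."""
    bounds = [0] + cut_positions(seq, site, cut_offset) + [len(seq)]
    sizes = [b - a for a, b in zip(bounds, bounds[1:]) if b > a]
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    toy = "AAAACTAGTTTTTTCTAGGG"  # hypothetical 20-nt substrate with two CTAG sites
    print(fragment_sizes(toy, "CTAG", cut_offset=1))
```

Comparing the predicted sizes for several substrates with the observed bands is one way to narrow down the recognition sequence before the standard mapping procedures are applied.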

Another preferred method for identifying the restriction 
enzyme encoded by a candidate gene involves first cloning the 
candidate open reading frame, together with its adjacent 
methylase gene into an appropriate host cell such as E. coli. 
For this purpose, PCR primers may be chosen so as to amplify 
the complete coding sequences for both methylase and 
restriction enzyme genes. These may be placed into a standard 
expression vector such as pUC19, and the resulting 
transformants would be tested for restriction endonuclease 
activity using standard procedures. Briefly, a small sample of each 
clone is grown. The cells may be harvested and sonicated to 
prepare a crude cell lysate. Following centrifugation to 
remove cell debris, the supernatant may be tested for 
restriction endonuclease activity by incubation of small 
samples with various DNAs as described above. 

It is conceivable that the methylase gene and/or 
the endonuclease gene might be lethal in the host cell, in which 
case the frequency of transformants from the PCR product 
would be abnormally low. In those circumstances, another 
approach is possible. Specifically, PCR may be used to 
amplify the methylase gene in the absence of its flanking 
sequences, and this gene may be cloned into an appropriate 
host cell such as E. coli. In this case, the transformants may 
be tested for methylase activity using a standard assay in 
which a crude extract from the clone and an appropriate DNA 
substrate such as those from bacteriophage λ, bacteriophage 
T7, Adenovirus-2, etc. would be incubated with 
[3H]-S-adenosylmethionine. The incorporation of [3H] into DNA may 
then be monitored by scintillation counting. The successful 
cloning of an active methylase gene may be detected if the 
crude extract can transfer 3H counts into DNA. If methylase 
clones are successfully obtained, then such clones may be 
expected to protect the host E. coli DNA against the possible 
deleterious action of the restriction endonuclease. An 
appropriate host cell which harbors the methylase clone may 
then be used as a recipient in a second cloning experiment, to 
obtain the endonuclease gene. This may be obtained by its 
amplification by PCR and cloning into a second compatible 
vector plasmid. As before, transformants may be tested for 
the presence of active restriction endonuclease. 

The present invention also relates to multipurpose 
cloning vectors and their use in cloning and/or in vitro and/or 
in vivo transcription and/or translation of nucleic acid 
segments that may be cytotoxic and/or may produce cytotoxic 
products. 

(1) A nucleic acid segment constituting an ORF is 
isolated and/or acquired by standard molecular biological 
methods. This may be undertaken so as to either maintain, or 
selectively alter the native sequence context of the coding 
region. The native sequence of the first (ATG, GTG, or TTG), or 
last (TAA, TAG, TGA) codon may be maintained or selectively 
altered in order to modulate translational efficiency, and/or 
provide for translation fusion. 

In a preferred method, a nucleic acid segment (ORF) is 
recombined by standard molecular cloning techniques into a 
plasmid having the following properties: 

i) oppositely oriented (opposing) transcriptional 
promoters, providing for sense-, anti-sense, and/or 
bidirectional transcription, flanking the inserted DNA 
segment. Preferably, the promoters will be cognate 
substrates for nonidentical RNAPs, and will not function as 
substrates for RNAPs for which they are not cognate. 
To provide for transcription of a particular strand 
of the inserted DNA segment, the vector preferably possesses 
a promoter that is a substrate for a host cell RNAP, such as 
the E. coli σ70 RNAP promoter, λ PL. In addition, to provide for 
transcription of the complementary strand of the inserted DNA 
segment, the vector of the present invention preferably 
possesses a promoter that is a substrate for a non-host cell 
RNAP, such as the bacteriophage T7 RNAP promoter, PT7. 

ii) the opposing promoters will contain sequences 
(operators) providing for binding of transcriptional repressor 
proteins (repressors). Preferably, the operators will be 
cognate ligands for nonidentical repressors, and will not 
functionally substitute for repressors for which they are not 
cognate ligands. To provide for transcriptional repression of a 
particular sequence of the inserted DNA segment, the vector 
preferably possesses an operator, such as Olac, that is a ligand 
for a repressor such as E. coli LacI. In addition, to provide for 
transcriptional repression of the complementary sequence of 
the inserted DNA segment, the vector preferably possesses an 
operator, such as OcI, that is a ligand for a repressor, such as 
bacteriophage λ cI857. 

iii) to modify the degree of transcription of a 
particular sequence of the inserted DNA segment, the cognate 
operator:repressor binding interactions may be selectively and 
independently manipulated. Preferably, conditions that affect 
one operator:repressor binding interaction will not detectably 
affect the other, and vice versa. To alleviate transcriptional 
repression via destabilization of an operator:repressor binding 
interaction, such as Olac:LacI, a synthetic chemical compound, 
such as isopropyl-thio-β-D-galactopyranoside (IPTG), is used. 
In addition, to alleviate transcriptional repression via 
destabilization of an operator:repressor binding interaction, 
such as OcI:cI857, permissive and non-permissive 
temperatures are employed. 

iv) to provide for its selective maintenance in 
cultured cells, the vector preferably possesses a genetic 
element specifying an antibiotic resistance phenotype, such as 
a β-lactamase allele. 

v) to provide for the persistence of a desired 
embodiment, the vector preferably possesses genetic 
elements capable of directing its episomal and/or 
intrachromosomal replication, such as the replicative origin 
of pBR322 (Bolivar, et al., Gene, 2:95-113 (1977)). 

One especially preferred plasmid is pLT7K (see Figure 3). 
The segment encoding replicative functions (encoded by rop 
and ori) is derived from pBR322 (Bolivar, et al., supra (1977)). 
The gene encoding β-lactamase (bla) confers ampicillin 
resistance, and has been altered to remove a recognition site 
for the PstI restriction endonuclease. The gene encoding 
kanamycin resistance is flanked by restriction sites suitable 
for cloning. The cI857 gene encodes a mutant form of the 
repressor protein, cI857 (Horiuchi and Inokuchi, Journal of 
Molecular Biology, 23(2):217-224 (1967)). The cI857 protein 
conditionally binds to DNA sequences (the cI operators, or OcI) 
that overlap PL and PR (bacteriophage λ major leftward and 
rightward promoters, respectively). The lacI gene encodes a 
repressor protein, LacI, that conditionally binds a DNA 
sequence (the lac operator, or Olac) which has been constructed 
to overlap PT7 (the bacteriophage T7 RNA polymerase 
transcriptional promoter). The segment containing λ cI857, PL 
and PR was subcloned from the pGW7 (Geoffrey Wilson, New 
England Biolabs, Inc.) derivative, pJIH1 (gift of R.E. Webster, 
Duke University). In the former construction, PL and PR are 
divergent and separated by cI857, as illustrated: 

<--PR--cI857--PL-->. 



In the latter construction, the segment containing cI857 and PR 
was inverted relative to PL, as illustrated: 

<--cI857--PR-->--PL-->. 

In the context of pLT7K and its derivatives, this arrangement 
of the bacteriophage λ PL and PR promoters has been designated 
PL/R. All of the genetic elements mentioned above are 
specified by sequences present on the plasmid. Transcription 
from PL/R proceeds towards PT7, whereas transcription from PT7 
proceeds towards PL/R. Transcription from PL/R is dependent 
upon the endogenous E. coli RNA polymerase, whereas from PT7 
it is dependent upon expression of an RNAP derived from 
bacteriophage T7. 

(2) The resulting construction is transformed into an 
appropriate host cell, such as an E. coli strain, under conditions 
intended to disallow undesired expression of the insert DNA, 
as specified in Example IX. Transformants are randomly 
selected for small-scale plasmid DNA preparation. The 
plasmid DNA is analyzed by restriction enzyme digestion for a 
banding pattern consistent with the desired clone. A sampling 
of clones exhibiting the appropriate restriction pattern is 
sequenced across the insertion site and compared to the 
original database entry. 

Clones that pass these examinations are transformed 
into an E. coli strain carrying two distinct RNAPs whose 
relative transcriptional efficiency can be simultaneously and 
independently modulated in an elective manner. 
Transformation and colony selection are carried out as above. 
Selected colonies are grown in liquid culture conditions 
intended to disallow expression of the insert DNA. Culture 
conditions may be subsequently altered so as to favor 
expression of the insert DNA. (See, e.g., Example IX.) 

In a particularly preferred embodiment, the cI857 
protein, which is a temperature-sensitive mutant of the cI 
repressor, is used to control PL/R-directed transcription by the 
host RNAP. The degree of OcI occupation by cI857 can be 
modulated by the temperature of the bacterial culture 
conditions. At ~30°C (permissive temperature), cI857 can 
bind OcI and effectively repress transcription from PL/R. 
However, at ~37°C (non-permissive temperature), cI857 cannot 
stably bind, and transcription from PL/R by the host RNAP is 
enabled. 

In one preferred embodiment, a plasmid host strain 
carrying genetic elements allowing for elective induction of 
an exogenous RNAP, such as E. coli strain ER2566, is used. 
ER2566 carries a gene encoding T7 RNAP (T7g1) inserted into 
the chromosomal lacZ locus, expression of which is repressed 
by LacI. Addition of IPTG to an ER2566-pLT7(x) (wherein "x" 
designates a specific construction derived from pLT7K) 
culture will: (1) alleviate LacI-mediated repression of the lac 
operon and promote expression of T7 RNAP by the host RNAP; 
and (2) alleviate LacI occupation of the plasmid-borne Olac 
site, thereby enabling transcription from PT7 by T7 RNAP. 

The Olac:LacI interaction is not significantly affected by 
temperature, nor is the OcI:cI857 interaction affected by the 
presence of IPTG. In the most preferred embodiment, 
operator:repressor interactions such as these can be 
simultaneously and independently manipulated, subsequently 
affecting transcriptional efficiency from respective 
promoters, such as PT7 and PL/R. Since the DNA sequences 
encoding the repressor proteins and operators are in cis, the 
molar ratio of repressor alleles to their respective operator 
sites is essentially equivalent to their normal chromosomal 
ratio. Thus, one may expect the desired repressor : operator 
interactions to quantitatively reflect wild-type interactions. 
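
The independence of the two control inputs can be summarized as a small truth table. The Python sketch below is a deliberately simplified model and not a description of measured behavior: it treats each operator as simply occupied or free, assumes that temperatures below about 37°C are permissive for cI857 binding, and uses hypothetical function and key names.

```python
# Sketch: which of the two opposing promoters is expected to be active for a
# given culture temperature and IPTG status. Binary simplification; the 37 C
# threshold and all names are assumptions made for illustration.

def promoter_states(temp_c, iptg, host_has_t7_rnap=True):
    ci857_bound = temp_c < 37.0   # cI857 occupies its operator at permissive temperature
    laci_bound = not iptg         # LacI occupies the lac operator unless IPTG is present
    return {
        "lambda_promoter_active": not ci857_bound,
        "t7_promoter_active": host_has_t7_rnap and not laci_bound,
    }


if __name__ == "__main__":
    for temp in (30, 37):
        for iptg in (False, True):
            print(temp, iptg, promoter_states(temp, iptg))
```

In this model the four combinations of temperature and IPTG give all four combinations of promoter activity, which is the property the independent operator:repressor pairs are intended to provide.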

The location and relative orientation of the plasmid- 
borne repressor alleles enables very tight regulation of 
expression from the desired promoter. For instance, in the 
unanticipated event that LacI levels drop below some critical 
threshold for Olac occupation (under culture conditions 
intended to favor expression from PL/R, but not PT7), lacI 
could be expressed by virtue of readthrough transcription 
originating from PL/R, in addition to its own promoter. This 
would increase the level of lacI transcript and concomitant 
expression of the LacI repressor protein. The same scenario 
applies to cI857 expression from PT7 (see Figure 3). Thus, 
strong positive regulation of the desired repressor : operator 
interaction has been built into the system. 

If desired, expression can be further controlled by either 
eliminating, or independently inhibiting either of the RNAPs. 
T7 RNAP, for example, can be physically excluded by using an 
E. coli strain that does not encode it. If a T7 RNAP allele is 
present, its adventitious expression can be mitigated by 
including a plasmid encoding coliphage T7 lysozyme, such as 
pLysP (gift of W.F. Studier, Brookhaven National Laboratory). 
T7 lysozyme interacts stoichiometrically with T7 RNAP and 
prevents the polymerase from effectively extending 
transcripts from PT7. Addition of IPTG to the culture medium 
permits the generation of sufficient amounts of T7 RNAP to 
overcome inhibition by T7 lysozyme. E. coli RNAP can be 
inhibited by the addition of the antibiotic rifampicin to the 
culture medium. T7 RNAP is not sensitive to rifampicin, nor 
is E. coli RNAP known to be affected by either IPTG or T7 
lysozyme. Thus, transcription catalyzed by the respective 
RNAPs can be simultaneously and independently modulated. 

(3) Cultures are harvested by centrifugation and 
sonicated to produce a crude lysate. Following centrifugation 
to remove cell debris, the supernatant is tested for 
biochemical activity, as appropriate. In the case of 
restriction endonuclease activity, the supernatant is 
incubated with various DNAs as described above. 

The stabilization of a nucleic acid segment in a vector 
such as pLT7K allows for sequence verification, mutagenesis, 
and expression. This solves the following shortcomings of in 
vitro transcription/translation (txn/tln) of a comparatively 
ephemeral PCR product: 

(a) A negative result cannot be unambiguously 
interpreted as the absence of a desired biochemical activity 
because: i) the lack of an internal positive control precludes 
discrimination between a technical failure and no activity; ii) 
the protein may not be sufficiently stable to survive the 
assay; iii) the protein may not be produced in sufficient 
quantity to generate a detectable signal in the assay; iv) the 
protein may not have sufficient specific activity to generate a 
detectable signal in the assay; v) the protein may not be 
active in the txn/tln extract; vi) there may be inactivating 
mutations in the genomic DNA from which the candidate PCR 
product is amplified; and vii) propagation of early PCR errors 
may negatively affect signal detection in the assay; 

(b) The PCR product is consumed as a function of the 
assay and must be regenerated as needed, with two significant 
consequences: i) it consumes genomic DNA (the source of all 
the candidate loci), which can be problematic if the DNA is 
difficult to obtain, as has been the case for Methanococcus 
jannaschii; ii) more importantly, the candidate ORF may not 
yield detectable activity because of the accumulation of one 
or more down mutations in its nucleic acid sequence. Even if 
such a mutation is identified, it may only be mutable if within 
the sequence encompassed by the PCR primers. If outside this 
region, ORF sequences are essentially immutable, and the gene 
product, if any, cannot be biochemically characterized with 
this approach. 

In yet another preferred embodiment of the present 
invention, the original microorganism from which the DNA 
sequence has been obtained may be grown up, crude extracts 
prepared, and tested for restriction endonuclease activity in 
the usual way described above. In the event that restriction 
endonuclease activity is found, then it may be related to the 
gene coding for it in several ways. First, if methylase clones 
are active, then they may be tested directly to see if the DNA 
from the cloned plasmid is resistant to the action of the 
restriction endonuclease, suggesting that they have matching 
specificities and so form part of the same restriction- 
modification system. Alternatively, the endonuclease may be 
purified to homogeneity, some N-terminal or other protein 
sequence obtained, and the protein sequence compared with the 
predicted protein sequence from the original sequenced gene. 

The following Examples are given to illustrate 
embodiments of the present invention, as it is presently 
preferred to practice. It will be understood that these 
Examples are illustrative, and that the invention described 
herein is not to be considered as restricted thereto except as 
indicated in the appended claims. 

The references cited above and below are herein 
incorporated by reference. 

EXAMPLE I 

MjaI Restriction Endonuclease: 

The restriction endonuclease MjaI, from Methanococcus 
jannaschii, has previously been characterized biochemically 
and shown to recognize the sequence CTAG (Zerler, B., Myers, 
P.A., Escalante, H. and Roberts, R.J. cited in REBASE - see 
Roberts, R.J. and Macelis, D. Nucl. Acids Res. 26: 338-350 
(1998)), but the gene had not been cloned. With the recent 
determination of the complete sequence of the M. jannaschii 
genome (Bult et al. Science 273: 1058-1073 (1996)) the 
sequence was searched using the BLAST program (Altschul, et 
al. J. Mol. Biol. 215: 403-410 (1990)) to identify candidate 
restriction enzyme and methylase genes. In brief, all open 
reading frames in the sequence were compared with the RM 
sequence database that contained the published sequences of 
all DNA methylases and restriction endonucleases that had 
been compiled from entries in GenBank. Each match against an 
entry in this database was recorded and the corresponding 
region of the M. jannaschii genome was examined to determine 
if the hit could be part of a restriction-modification system. 
Typically, most good hits were between a known DNA 
methylase gene and an open reading frame present in the M. 
jannaschii genome. 
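
Once methylase hits have been recorded, the step of examining the surrounding region can be reduced to listing the open reading frames that lie immediately next to each hit. The following Python sketch shows one way this might be done; the data structure, the 500-nucleotide gap limit, and the toy coordinates are all assumptions made for illustration and are not taken from the M. jannaschii annotation.

```python
# Sketch: for every ORF that matched a known methylase, report the immediately
# flanking ORFs as restriction endonuclease candidates. Names, coordinates and
# the max_gap default are hypothetical.
from typing import NamedTuple


class Orf(NamedTuple):
    name: str
    start: int  # 1-based coordinate of the first nucleotide
    end: int    # coordinate of the last nucleotide


def neighbours_of_hits(orfs, methylase_hits, max_gap=500):
    ordered = sorted(orfs, key=lambda o: o.start)
    candidates = {}
    for i, orf in enumerate(ordered):
        if orf.name not in methylase_hits:
            continue
        near = []
        for j in (i - 1, i + 1):
            if 0 <= j < len(ordered):
                other = ordered[j]
                # Positive gap = intergenic distance; negative = overlap.
                gap = max(other.start - orf.end, orf.start - other.end)
                if gap <= max_gap and other.name not in methylase_hits:
                    near.append(other.name)
        candidates[orf.name] = near
    return candidates


if __name__ == "__main__":
    toy = [Orf("orfA", 100, 800), Orf("orfB", 845, 1700), Orf("orfC", 4400, 5000)]
    print(neighbours_of_hits(toy, {"orfB"}))
```

Each candidate returned in this way would then be carried forward to the in vitro or in vivo assays described above.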

By using BLAST it was found that one open reading frame 
(MJ0985) showed great similarity to a known DNA methylase 
gene, encoding M.MthZI, a methylase which forms part of a 
restriction-modification system in Methanobacterium 
thermoformicicum that recognizes the sequence CTAG 
(Nolling, J. and deVos, W.M., Nucl. Acids Res. 20: 5047-5052 
(1992); Nolling, J., Van Eeden, et al., Nucl. Acids Res. 20: 
6501-6507 (1992)). The regions of similarity included the motifs 
characteristic of an N4C- or N6A-methylase (Wilson, G.G., 
Meth. Enzymol. 216:259-279 (1992); Timinskas et al., Gene 
157:3-11 (1995)). Immediately adjacent to this M. jannaschii 
putative methylase gene was another open reading frame, 
MJ0984, that resembled the gene encoding the restriction 
enzyme MthZI. This open reading frame, which had never 
previously been investigated biochemically, was tested for its 
coding potential using the method disclosed in accordance with 
the present application. This Example documents the 
identification of an active restriction endonuclease from a 
previously unknown DNA sequence. 

DNA from M. jannaschii was a gift from G. Olson, 
University of Illinois, Urbana. The open reading frame 
MJ0984, predicted to encode the MjaI restriction endonuclease, 
comprised residues 4687-5355 of the GenBank entry U67541. 

Primers were selected with the following sequences: 

5'-pGTTTAATACGACTCACTATAGGGTTAGGAGGTATTACAT 
(A)TGGTGAMCTTATGAAAAAATTG-3' (SEQ ID NO:1) 

Note that the marked (A) is a G in the original genome. It 
was changed to an A to ensure a better translational start. 
This is the start codon of the open reading frame. Sequences 
preceding the (A) are not present in the genome, but contain 
the T7 RNA polymerase promoter sequence and a good ribosome 
binding site. 

5'-pGTTGGATCCGCAAAAMGAATAGGAATGGATTTTAATG-3' 
(SEQ ID NO:2) 

These primers were first used to prepare an amplified 
sample of the region of the M. jannaschii genome containing the 
MJ0984 open reading frame. The MJ0984 open reading frame 
was amplified from genomic M. jannaschii DNA in three PCR 
reactions (80 µl each) that contained 0.4 mM each of the four 
dNTPs, 0.02 µg M. jannaschii genomic DNA, 0.4 µM primer 1, 0.4 
µM primer 2, 1.2 units Vent® DNA polymerase and either 3 mM, 
4.5 mM or 6 mM MgSO4 in 1X NEB ThermoPol buffer. The reaction 
was heated to 95°C for three minutes, and then 5 cycles of 
amplification at 95°C for 30 seconds, followed by 52°C for 30 
seconds, followed by 72°C for 45 seconds were performed, 
followed by 20 cycles at 95°C for 30 seconds, 62°C for 30 
seconds and 72°C for 45 seconds. 10 µl of each PCR reaction 
was analyzed by gel electrophoresis, and a prominent band of 
the expected size was observed in the 4.5 mM and 6 mM MgSO4 
reactions. These two reactions were combined, extracted with 
phenol/chloroform, washed in an Amicon Microcon-100 
microfiltration device by four serial 20-fold dilution and 
concentration steps into TE buffer and the final 40 µl of 
concentrated product was stored at 4°C. 

The same primers, 1 and 2, were then used in a set of 24 
PCR reactions (100 µl each) that contained 0.8 mM each of the 
four dNTPs, 0.01 µg pre-amplified M. jannaschii DNA described 
above, 0.5 µM primer 1, 0.5 µM primer 2, and 2 units Vent® 
DNA polymerase (New England Biolabs, Inc., Beverly, MA) in 1X 
NEB ThermoPol Buffer. The reaction mix was heated at 95°C 
for three minutes, and then subjected to 25 rounds of PCR, 
incubating at 95°C for 30 seconds, 46°C for 30 seconds, 72°C 
for 50 seconds. Finally the reaction was incubated at 30°C for 
two minutes. The crude mixture from the PCR reactions was 
then combined and purified. First a standard 
phenol/chloroform extraction was carried out to remove 
protein and the DNA was precipitated with isopropanol and 
then spun at 9,000 rpm for 7 min in the microfuge through 
Microcon 50 filters. The concentrated PCR product (300 µg/ml) 
was collected at 2,000 rpm for 5 min. The product was checked 
on a 1% agarose gel. 

The transcription and translation of the putative MjaI 
gene was performed using a rabbit reticulocyte Protein 
Truncation Kit (Boehringer Mannheim). The PCR product (0.4 µg, 
2 µl), transcription mix (2.5 µl) and 5.5 µl of RNase-free 
water were incubated at 30°C for 30 min. The translation mix 
(40 µl) was added and incubated at 30°C for 1 hr. The 
transcription/translation mix was then tested for 
newly-formed restriction enzyme activity corresponding to the 
formation of MjaI. 

Serial dilutions were performed by mixing 2 µl, 1 µl, 
0.5 µl and 0.25 µl translation product per 20 µl final reaction 
volume in 1X NEB buffer 4 (50 mM potassium acetate, 20 mM 
Tris-acetate, 10 mM magnesium acetate, 1 mM dithiothreitol, 
100 µg/ml BSA) containing 25 µg/ml substrate DNA. The 
reactions were incubated at 37°C overnight. The reactions 
were run on a 1.0% agarose gel. As a positive control, BfaI (20 
units, New England Biolabs, Inc.), an isoschizomer of MjaI, was 
used to cut the substrate DNA under the same reaction 
conditions. As a negative control, the DNA was incubated with 
the transcription/translation mix to which no template DNA 
(PCR product) had been added. 

The agarose gel results showed that the test DNA was 
digested by the transcription/translation mix only when that 
mix had been primed with PCR product from the putative 
MjaI-encoding plasmid DNA. The banding pattern produced was 
identical to that produced by BfaI (Figure 1, lanes 2 and 4). A 
double digest between MjaI and BfaI gave no additional bands 
(Figure 1, lane 3). These results allow the identification of 
the open reading frame present in the starting plasmid as 
encoding MjaI restriction endonuclease. 

EXAMPLE II 

HhaI Restriction Endonuclease: 

The genes encoding the restriction endonuclease and 
methylase of the HhaI system have previously been cloned and 
sequenced (U.S. Patent No. 4,999,293). Examination of the 
sequence showed a characteristic 5-methylcytosine methylase 
gene followed by an open reading frame on the complementary 
strand that was known to be the HhaI restriction endonuclease. 
This system was used as a test to show that it would be 
possible to make a sufficient quantity of the restriction 
enzyme in vitro to allow its detection using standard 
procedures. 

First, plasmid DNA encoding the HhaI restriction system 
was prepared from E. coli NEB691 (New England Biolabs). The 
E. coli cells containing the recombinant plasmid were 
incubated in 10 ml LB in a roller at 37°C overnight. Cells were 
pelleted at 4,000 rpm for 30 sec at 4°C and the supernatant 
was discarded. The pellet was resuspended in 1 ml 1X GTE (50 
mM glucose, 25 mM Tris-HCl, 10 mM EDTA, pH 8.0) and lysed by 
adding 0.2 M NaOH, 1% SDS (2 ml). The precipitate was spun for 
3 min at 15,000 rpm at 4°C and the supernatant was 
transferred to a clean centrifuge tube. Isopropanol was added 
to the supernatant and it was incubated on ice for 10 min. The 
mixture was spun at 15,000 rpm for 5 min at 10°C and the 
supernatant was discarded. The pellet was dried and 
resuspended in 100 µg/ml pancreatic RNase in 850 µl 1X TE 
(10 mM Tris-HCl, 1 mM EDTA, pH 8.0). The reaction was 
incubated at room temperature for 1 hour and spun at 14,000 rpm at 
4°C for 5 min. The supernatant was discarded and the pellet 
was resuspended in 100 µl 1X TE. The product was checked on a 
1% agarose gel. 

Primers were synthesized with the following sequences: 



5'-pTAATACGACTCACTATAGGGAATAATTTTGTTTTAACTTTAA 
GAAGGAGAATGAAAATGAATTGGAAAG-3' (SEQ ID NO:3) 

5'-pCAATTATAAAGAAATAGCTGCC-3' (SEQ ID NO:4) 

These primers were used in a set of 24 PCR reactions 
(100 µl each) that contained 0.8 mM each of the four dNTPs, 0.1 
µg plasmid DNA, 0.5 µM primer 3, 0.5 µM primer 4, and 2 units 
Vent® DNA polymerase in 1X NEB ThermoPol Buffer. The 
reaction mix was heated at 95°C for three minutes, and then 
subjected to 25 rounds of PCR, incubating at 95°C for 30 
seconds, 46°C for 30 seconds, 72°C for 50 seconds. Finally the 
reaction was incubated at 30°C for two minutes. The PCR 
reactions were then combined, phenol/chloroform extracted 
and the DNA was precipitated and resuspended in 1X TE at 300 
µg/ml. 

The transcription and translation of the HhaI gene PCR 
product was performed using a rabbit reticulocyte Protein 
Truncation Kit (Boehringer Mannheim). The PCR product (0.6 µg, 
2 µl), transcription mix (2.5 µl) and 5.5 µl of RNase-free 
water were combined and incubated at 30°C for 30 min. The 
translation mix (40 µl) was added and incubated at 30°C for 1 
hr. The transcription/translation mix was then tested for 
newly-formed restriction enzyme activity corresponding to 
the formation of HhaI. 

Serial dilutions were performed by mixing 2 µl, 1 µl, 0.5 µl 
and 0.25 µl transcription/translation product per 20 µl 
final reaction volume in 1X NEB buffer 4 containing 25 µg/ml 
substrate DNA. The reactions were incubated at 37°C for one 
hour. The reactions were analyzed on a 1.0% agarose gel. As a 
positive control, authentic HhaI (20 units, New England 
Biolabs, Inc.) was used to cut the substrate DNA under the 
same reaction conditions. As a negative control, the DNA was 
incubated with the transcription/translation mix to which no 
template DNA (PCR product) had been added. The agarose gel 
results showed that the substrate DNA was digested by the 
transcription/translation mix only when that mix had been 
primed with the HhaI endonuclease PCR product. The banding 
pattern produced was identical to that produced by HhaI, thus 
demonstrating the utility of the in vitro 
transcription/translation system to produce an active, 
identifiable restriction endonuclease. 

EXAMPLE III 

A second putative new restriction endonuclease from M. 
jannaschii (ORF 1328 - GTNNAC, MjaIV): 

Another of the open reading frames that showed a good 
match to a known methylase gene was MJ1328. This gene is 
similar to the gene for M.HincII, which recognizes the sequence 
GTYRAC. The open reading frame immediately preceding 
MJ1328 shows some low similarity to the gene for the HincII 
restriction enzyme and so is a good candidate for a new 
restriction enzyme of the same or related specificity. This 
open reading frame, MJ1327, comprises residues 1748-2485 of 
GenBank entry U67573. However, because M. jannaschii is a 
thermophile that normally grows at high temperatures, this 
new putative restriction enzyme encoded by the open reading 
frame MJ1327 may be anticipated to work at much higher 
temperatures than HincII, isolated from the mesophile 
Haemophilus influenzae serotype c (Landy et al. Biochemistry 
13: 449-456, 1974). 

The ORF designated MJ1328 by TIGR (The Institute for 
Genomic Research), which comprises residues 3148 to 4044 of 
GenBank entry U67573, contains only the 3' portion of the 
believed methylase gene. The complete methylase gene 
would be found from position 2472 to 4044 of GenBank 
sequence U67573, with a frameshift present between 
positions 3148 and 3305. The 5' portion of this ORF, which is 
not contained in the TIGR designation, contains the methylase 
motifs (GxGxF and NPPY), while the whole has homology to 
M.HincII. 
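
Screening a translated open reading frame for the two methylase motifs named above can be illustrated with simple regular expressions. The Python sketch below is a minimal example under stated assumptions: "x" is taken to mean any single amino acid, the motifs are matched exactly with no tolerance for conservative substitutions, and the function name and the test peptide are hypothetical.

```python
# Sketch: locate the GxGxF and NPPY motifs in a protein sequence.
# Exact matching only; real motif searches usually allow some degeneracy.
import re

MOTIFS = {
    "GxGxF": re.compile(r"G.G.F"),  # "x" treated as any single residue
    "NPPY": re.compile(r"NPPY"),
}


def find_motifs(protein):
    """Return a dict mapping motif name to the 0-based start positions found."""
    protein = protein.upper()
    return {name: [m.start() for m in pattern.finditer(protein)]
            for name, pattern in MOTIFS.items()}


if __name__ == "__main__":
    toy_protein = "MKLGAGTFAAANPPYKK"  # hypothetical peptide, not from MJ1328
    print(find_motifs(toy_protein))
```

An ORF in which both motifs occur would be flagged as a putative methylase, and its neighbors examined as endonuclease candidates.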

To characterize MJ1327, the ORF was PCR amplified 
from genomic M. jannaschii DNA using the following two 
oligonucleotides as primers: 

forward (coding strand) primer, having a BamHI cloning site, 
T7 promoter sequence, and NcoI cloning site: 

5'-GTTGGATCCTAATACGACTCACTATAGGAACAGACCACCATGGTG 
GTAAAATTGGTTAATAAC-3' (SEQ ID NO:7) 

reverse primer having a BamHI cloning site: 

5'-GTTGGATCCGATTGTAGAAAGATTTATCATTAATTC-3' 
(SEQ ID NO:8) 

The PCR reaction was performed by combining: 
20 µl 10X NEB ThermoPol Buffer (NEB), 16 µl dNTP solution 
(4 mM), 15 µl forward primer (10 µM), 15 µl reverse primer 
(10 µM), 135 µl dH2O, 1.5 µl M. jannaschii genomic DNA (100 
ng), mixing, 

then adding: 

4 µl Vent® exo- DNA polymerase, 1 µl Vent® DNA polymerase, 
dividing into 5 tubes of 40 µl each, adding 0, 0.4, 0.8, 1.2 or 1.6 µl 
100 mM MgSO4 solution to one tube each to create reactions of 
2, 3, 4, 5 and 6 mM Mg++ concentrations. 
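
The magnesium titration can be checked with a simple mass balance. The sketch below assumes the unsupplemented mix sits at 2 mM Mg++, which is inferred from the stated 2 to 6 mM series rather than given explicitly; the 40 µl tube volume and 100 mM stock come from the text. The function name is hypothetical.

```python
def final_mg_mM(base_mM, tube_ul, stock_mM, added_ul):
    """Final Mg++ concentration after adding MgSO4 stock to one tube."""
    total_ul = tube_ul + added_ul
    return (base_mM * tube_ul + stock_mM * added_ul) / total_ul

# Assumption: 2 mM Mg++ before any addition (inferred from the stated series).
additions_ul = [0.0, 0.4, 0.8, 1.2, 1.6]
series = [final_mg_mM(2.0, 40.0, 100.0, v) for v in additions_ul]
for v, c in zip(additions_ul, series):
    print(f"{v:.1f} ul added -> {c:.2f} mM Mg++")
```

Because each addition also dilutes the tube slightly, the computed values fall a little below the nominal whole numbers, which is consistent with the concentrations being quoted as round figures.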

These five tubes were incubated at 95°C - 2 min for one 
cycle, 95°C- 30 sec, 52°C - 30 sec, 72°C - 1 min 15 sec for 5 
cycles, then 95°C- 30 sec, 58°C - 30 sec, 72°C - 1 min 15 sec 
for 27 cycles. 
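
For reference, the cycling conditions above can be written as a small data structure, from which the number of amplification cycles and the total programmed hold time follow directly. This is only a bookkeeping sketch; ramp times between temperatures depend on the instrument and are not included.

```python
# (repeat count, [(temperature in deg C, hold time in seconds), ...]),
# transcribed from the cycling conditions stated in the text.
PROGRAM = [
    (1,  [(95, 120)]),
    (5,  [(95, 30), (52, 30), (72, 75)]),
    (27, [(95, 30), (58, 30), (72, 75)]),
]

def total_hold_seconds(program):
    """Sum of all programmed hold times, ignoring ramping."""
    return sum(cycles * sum(t for _, t in steps) for cycles, steps in program)

def amplification_cycles(program):
    """Count cycles that include more than the initial denaturation step."""
    return sum(cycles for cycles, steps in program if len(steps) > 1)

print(amplification_cycles(PROGRAM), "cycles,",
      total_hold_seconds(PROGRAM) / 60, "min of programmed holds")
```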

Product was observed in the 4 and 5 mM Mg++ reactions. 
The product obtained was used as template and 15 more cycles 
of amplification in a 500 µl reaction as above were performed 
to obtain a larger quantity of PCR product. The amplified DNA 
was phenol/chloroform extracted and alcohol precipitated, 
then cleaved with BamHI, phenol-chloroform extracted, 
alcohol precipitated, resuspended in TE and ligated to pUC19 
DNA previously cleaved with BamHI and dephosphorylated. The 
ligated product was transformed into E. coli ER2170 cells by 
electroporation, and the transformed cells were grown in LB 
broth + 100 µg/ml ampicillin overnight. A sample of these 
transformed cells, E. coli ER2170-pUC-MjaIV, was deposited 
under the terms and conditions of the Budapest Treaty with 
the American Type Culture Collection on , 1998 and 
received ATCC Accession No. . 

The cells were then harvested by centrifugation, 
resuspended in sonication buffer (20 mM Tris, 1 mM DTT, 0.1 
mM EDTA, pH 7.5), lysed by sonication and the extract was 
clarified by centrifugation. This crude extract was assayed 
for restriction activity using λ DNA in NEBuffer 4. Specific 
cleavage of λ DNA was observed and the restriction activity was 
purified by passing the crude extract through a heparin-sepharose 
column and step eluting the column with 0.5 M and 
1 M NaCl in sonication buffer. The purified restriction activity 
was mapped on pBR322, φX174 and M13mp18 DNAs, and the 
cleavage pattern was found to be consistent with cleavage at 
the sequence 5'-GTNNAC-3'. This new endonuclease was named 
MjaIV. The cleavage position within the recognition sequence 
was determined by the primer extension method using 
M13mp18 and primer NEB #1224 and found to be 5'-GTN↓NAC-3', 
cleaving between the 2 N residues to produce blunt ends. 

The HincII sequence, 5'-GTYRAC-3', originally postulated 
for this restriction system, is a subset of the actual 
recognition sequence of MjaIV, thus explaining the homology 
noted previously between MJ1328 and the gene for M.HincII and 
between MJ1327 and the gene for R.HincII. 
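
The subset relationship between the two recognition sequences can be verified by expanding the degenerate bases. The sketch below uses the standard IUPAC nucleotide codes (R = A or G, Y = C or T, N = any base); the helper name is hypothetical.

```python
from itertools import product

# Standard IUPAC codes for the symbols used in the two sites.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}

def expand(site):
    """All concrete DNA sequences matched by a degenerate recognition site."""
    return {"".join(p) for p in product(*(IUPAC[b] for b in site))}

hincii_sites = expand("GTYRAC")
mjaiv_sites = expand("GTNNAC")
print(len(hincii_sites), len(mjaiv_sites), hincii_sites <= mjaiv_sites)
```

Every sequence cut by an enzyme recognizing GTYRAC is therefore also a GTNNAC site, while the converse does not hold.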

MjaIV methylase (ORF MJ1328 plus 5' end) will be put into 
an appropriate vector and expressed in E. coli to protect the E. 
coli host DNA from degradation by the MjaIV endonuclease, which 
will be cloned into a strongly expressing, regulated vector, 
such as pET21 (T7) or pRRS. The MjaIV endonuclease may then be 
produced by culturing the host carrying the gene for MjaIV, 
inducing with appropriate conditions, harvesting the cells and 
purifying the MjaIV endonuclease by a combination of standard 
protein purification techniques. 

EXAMPLE IV 

A 3rd putative new restriction endonuclease from M. 
jannaschii (ORF 1449 = GGNCC, MjaII): 

Another of the open reading frames that showed a good 
match to a known methylase gene was MJ1448. This gene is 
quite similar to the gene for M.MvaI, which recognizes the 
sequence CCWGG. At the time of the original analysis, the open 
reading frames on both sides of MJ1448 had no matches either 
to known restriction enzyme genes or to any other open reading 
frames present in GenBank. One of these was likely to be a 
restriction enzyme gene, and so both were tested using the 
methods of Example I. 

To test which of these open reading frames was the 
putative new restriction enzyme, a detailed protocol similar to 
that of Example I was employed. The segment of the genome of 
M. jannaschii containing the open reading frame MJ1447 
comprising residues 8643-9788 of GenBank entry U67585 was 
amplified using the following PCR primers: 

5'-GTTTAATACGACTCACTATAGGGTTAGGAGGTATTACAT(A)TG 
ATAAMTTTGGAGAAGCAGTTTTG-3' (SEQ ID NO:9) 

Note that the marked (A) is the start codon of the open 
reading frame. Sequences preceding the (A) are not present in 
the genome, but contain the T7 RNA polymerase promoter 
sequence and a good ribosome binding site. 

5'-GTTGGATCCGTGTAAAGTTTTTTTGCTGGCTG-3' 
(SEQ ID NO:10) 

The product of this open reading frame was tested in a 
manner similar to that of Example I and was found not to be 
enzymatically active at cleaving DNA. 

The candidate ORF MJ1449 was identified as outlined 
above. The segment of the genome of M. jannaschii, comprising 
residues complementary to 11380-12492 in GenBank entry 
U67585, was amplified by PCR using the following two 
oligonucleotides as primers: 

5'-CCTCCTCTAGAAGAAGGAGATATACCATGCCACTAAGTAAAA 
ATGTTATAG-3' (SEQ ID NO:11) 

5'-GGAGGGATCCTCGAGCGCTTGACTGAATAGTTATTTTTGCAT 
ATATTTATTGTATAATTC-3' (SEQ ID NO:12) 
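
Because MJ1449 is described as lying on residues complementary to the listed interval, its coding sequence is the reverse complement of that stretch of the database entry. The sketch below shows the coordinate handling (1-based, inclusive, as in GenBank) on an invented toy sequence; it does not use the actual U67585 entry, and the function name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def extract_orf(entry_seq, start, end, complementary=False):
    """Return residues start..end (1-based, inclusive).

    With complementary=True the reverse complement is returned, so the
    result reads 5' to 3' on the coding strand.
    """
    segment = entry_seq[start - 1:end]
    if complementary:
        segment = segment.translate(COMPLEMENT)[::-1]
    return segment

# Invented toy entry: a short ORF encoded on the complementary strand
# at positions 3-14 (NOT real M. jannaschii sequence).
toy_entry = "GGTTACATTTTCATCC"
print(extract_orf(toy_entry, 3, 14, complementary=True))
```

Primers for such an ORF are designed against the reverse-complemented sequence, so that the forward primer anneals at the ATG.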

Using the protocol described in Examples IX and X below, 
ORF MJ1449 was stably cloned in DH5αF and the construction 
designated pLT7-M1449. When transformed into ER2566P 
(where "P" indicates the presence of pLysP), the protein 
expressed from this construct exhibited an activity consistent 
with that of a restriction endonuclease cleaving the sequence 
GGNCC, at an assay temperature of 65°C. A sample of these 
transformed cells, E. coli ER2566P pLT7-M1449, was 
deposited under the terms and conditions of the Budapest 
Treaty with the American Type Culture Collection on 
, 1998 and received ATCC Accession No. . 

This activity was previously detected biochemically 
from crude lysates of M. jannaschii, and designated MjaII, 
but the gene was unknown. Induction of pLT7-M1449 at 37°C 
was lethal, indicating that the protein is also active at this 
temperature. 

EXAMPLE V 

Expression of R.SfiI in a coupled 
transcription/translation system from E. coli: 

The restriction endonuclease SfiI from Streptomyces 
fimbriatus, recognizing the octanucleotide sequence 
5'-GGCCNNNN↓NGGCC-3' (SEQ ID NO:13), has been cloned and 
overexpressed in E. coli (U.S. Patent No. 5,616,484). The 
overexpression construct (SfiA-2) consists of the SfiI DNA 
methyltransferase expressed on the vector pACYC184, under 
control of its own promoter, and the SfiI endonuclease 
expressed on a pUC19 derivative containing a T7 promoter, 
such that the gene is under control of either the Plac promoter 
or the T7 promoter. Plasmid DNA was purified from a 4 liter 
culture of E. coli ER1451 (Elisabeth Raleigh, New England 
Biolabs, Inc., Beverly, MA) harboring both plasmids using the 
alkaline lysis method followed by isopycnic banding in two 
successive cesium chloride gradients to remove all traces of 
contaminating chromosomal DNA. 

An S-30 extract was prepared from a 10-liter culture of 
E. coli strain D-10 (rna-10, relA1, spoT1, metB1, Gesteland, 
R.F., J. Mol. Biol. 16:67 (1966)), an RNase I-deficient K-12 
derivative, as described (Ellman, et al., Methods Enzymol. 
202:301-336 (1991)). 

In vitro protein synthesis reactions (30 µl final volume) 
contained the following: 56.4 mM Tris-acetate, pH 7.4; 1.76 
mM dithiothreitol; 36 mM ammonium acetate; 72 mM potassium 
acetate; 9.7 mM calcium acetate; 6.7 mM magnesium acetate; 
1.22 mM ATP (Na), 0.85 mM each of GTP (Na), CTP (Na), and UTP 
(Na); 27 mM potassium phosphoenol pyruvate; 0.35 mM each of 
the 20 amino acids; 19 mg/ml polyethylene glycol 8000; 35 
mg/ml folinic acid; 27 mg/ml pyridoxine-HCI; 27 mg/ml NADP; 
27 mg/ml FAD; 11 mg/ml p-aminobenzoic acid; 170 mg/ml E. 
coli tRNA; 100 µg/ml SfiA-2 plasmid DNA; 25000 U/ml T7 RNA 
polymerase (where indicated) and 8.5 µl S-30 extract. 
Reactions were incubated at 37°C for 1 hour on a rotary shaker 
(200 rpm), cooled to 0°C, and centrifuged 1 minute to pellet 
precipitated proteins. 

The reaction supernatants were then assayed for SfiI 
activity in 25 µl reactions containing 1 µg Adenovirus-2 
genomic DNA (35,937 bp) in NEBuffer 2 (10 mM Tris-HCl, pH 
7.9, 50 mM NaCl, 10 mM MgCl2, 1 mM DTT), 100 µg/ml BSA, and 
three-fold serial dilutions (in NEBuffer 2) of the in vitro 
reaction supernatant. Reactions were incubated at 50°C for 60 
minutes and analyzed by agarose gel electrophoresis. As these 
reactions did not contain S-adenosylmethionine, a necessary 
cofactor for the SfiI DNA methyltransferase (MTase), any 
MTase synthesized in the translation reaction from the SfiA-2 
DNA template would not be active during the endonuclease 
assay reaction. 

The results (Figure 2) demonstrate complete cleavage of 
Adenovirus-2 substrate DNA at the highest dilution tested 
(lane 12) for the T7 polymerase-directed translation reaction 
(0.03 µl of reaction supernatant), corresponding to a yield of 
synthesized SfiI activity of at least 33000 units per ml of in 
vitro translation reaction. Assuming a specific activity of 
20,000 units/mg and a monomer molecular mass of 25 kDa, 
this corresponds to roughly 1,000 synthesized R.SfiI molecules 
per molecule of input DNA template. For the reaction without 
added T7 RNA polymerase, in which transcription was 
presumably from the weaker E. coli Plac promoter, the yield of 
SfiI activity was roughly 10-fold lower (cf. lanes 5 and 12), or 
3000 units per ml, indicating that protein synthesis is 
transcription limited in this system. 
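
The yield estimate above can be reproduced with a short calculation. The sketch assumes that complete digestion of 1 µg of substrate in the 60 minute assay corresponds to one unit of activity, which is how the 33000 units per ml figure appears to have been derived; the specific activity and monomer mass are the values assumed in the text. Converting further to molecules per template molecule would also require the template size, which is not stated here, so that step is left out.

```python
def units_per_ml(units_detected, volume_ul):
    """Activity concentration given the units seen in a sample volume (ul)."""
    return units_detected / (volume_ul / 1000.0)

def enzyme_micromolar(units_ml, specific_activity_u_per_mg, mass_kda):
    """Molar concentration (uM) of enzyme monomer implied by an activity."""
    mg_per_ml = units_ml / specific_activity_u_per_mg  # numerically equal to g/L
    mol_per_l = mg_per_ml / (mass_kda * 1000.0)
    return mol_per_l * 1e6

# Assumption: 0.03 ul of supernatant contained 1 unit (complete cleavage of 1 ug).
activity = units_per_ml(1.0, 0.03)
print(f"{activity:.0f} units/ml")
print(f"{enzyme_micromolar(33000, 20000, 25):.1f} uM monomer")
```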

EXAMPLE VI 

A new MboI isoschizomer from M. jannaschii 
(ORF 600 - GATC, MjaIII): 

The MJ600 ORF, comprising residues 5632 to 6504 of 
GenBank entry U67508, was predicted to encode an 
isoschizomer of MboI on the basis of homology to MboI and 
LlaII, as determined by the method of Example I. 

MJ600 was amplified and cloned in the same manner as 
MJ1327, by the method of Example III, using as primers: 

(forward) 

5'-GTTGGATCCTAATACGACTCACTATAGGAACAGACCACCATG 
AATTTTGAATACATCATTAACAG-3' (SEQ ID NO: 13) 

(reverse) 

5'-GTTGGATCCAAATTGAATAATGGTATCATTCAC-3' 
(SEQ ID NO: 14) 

and the restriction activity was found to cleave at 5'-GATC-3'. 
This confirms that this ORF encodes an isoschizomer of MboI, 
as predicted. This isoschizomer, MjaIII, from the 
thermophilic organism M. jannaschii, can be expected to be 
significantly more thermostable than MboI. 

EXAMPLE VII 

Expression of HindIII in a coupled 
transcription/translation system from E. coli: 

The genes encoding the restriction endonuclease and 
methylase of the HindIII system have previously been cloned 
and sequenced (U.S. Patent No. 5,180,673). The present 
invention's competence in identifying restriction 
endonucleases was further demonstrated by the use of the 
following standard procedures to make sufficient quantity of 
HindIII enzyme in vitro to allow its detection. 

First, plasmid DNA encoding the HindIII restriction 
system was prepared from E. coli NEB 325 (New England 
Biolabs) by standard methods. 

Primers were synthesized with the following sequences: 

5'-CGAAATTAATACGACTCACTATAGGGAGACCACAACGGTTAA 
GGAGGTGACAAAATGAAGAAAAGTGCGTTAGAG-3' 
(SEQ ID NO:15) 

5'-AAATGGATCCAGAATTATAAATACAGTCTATCATTAC-3' 
(SEQ ID NO:16) 

These primers were used in a set of 5 PCR reactions (100 
µl each) that contained 0.2 mM each of the four dNTPs, 0.1 µg 
plasmid DNA, 0.5 µM of each above mentioned primer, and 2 
units Vent® DNA polymerase in 1X NEB ThermoPol Buffer (10 
mM KCl, 20 mM Tris-HCl (pH 8.8 at 25°C), 10 mM (NH4)2SO4, 4 
mM MgSO4, 0.1% Triton X-100). The reaction mix was heated at 
95°C for 30 seconds, 55°C for 45 seconds, 72°C for 75 seconds 
for 20 cycles. Finally, the reaction was incubated at 72°C for 
10 minutes. The reactions were combined and 
phenol/chloroform extracted. The DNA was concentrated and 
primer dimer products partially removed by using a Microcon 
50 device according to the manufacturer's instructions for 
3 rounds of 20-fold concentration and dilution. The purified 
PCR product was concentrated to 50 µg/ml. 

The transcription and translation of the HindIII gene was 
performed using a rabbit reticulocyte Protein Truncation Test 
Kit (Boehringer Mannheim). The PCR product (0.4 µg (2 µl)), 
transcription mix (2.5 µl) and RNase free water (5.5 µl) were 
combined and incubated at 30°C for 30 min. The translation 
mix (40 µl) was added and incubated at 30°C for 1 hr. The 
transcription/translation reaction was then tested for newly 
formed HindIII restriction enzyme activity. 

Serial dilutions of the transcription/translation reaction 
were performed in NEB buffer 2 (50 mM NaCl, 10 mM 
Tris-acetate, 10 mM MgCl2, 1 mM dithiothreitol, 100 µg/ml BSA) 
containing 25 µg/ml λ phage substrate DNA using 1.6 µl, 0.53 
µl, 0.17 µl or 0.06 µl transcription/translation reaction 
product per 20 µl final reaction volume in 1X NEB buffer 2 
containing λ DNA. The reactions were incubated at 37°C for 14 
hours. As a positive control, authentic HindIII (20 units, New 
England Biolabs, Inc.) was used to cut the substrate DNA under 
the same reaction conditions. As a negative control, the DNA 
was incubated with the transcription/translation mix to which 
no template DNA (PCR product) had been added. 
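
The four sample volumes listed above follow an approximately three-fold dilution series starting from 1.6 µl. The sketch below regenerates such a series; the three-fold factor is inferred from the listed values rather than stated in this example, and the computed volumes agree with the listed ones only to within rounding.

```python
def dilution_series(start_ul, fold, steps):
    """Volumes of undiluted sample represented at each step of a serial dilution."""
    return [start_ul / fold ** i for i in range(steps)]

# Assumption: a three-fold series, inferred from the volumes in the text.
volumes = dilution_series(1.6, 3, 4)
print([round(v, 2) for v in volumes])
```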

HindIII restriction activity was clearly observed in the in 
vitro transcription/translation reaction, demonstrating the 
efficacy of the in vitro method described in the instant 
application. 

EXAMPLE VIII 

In Vitro Transcription/Translation of PacI Restriction 
Endonuclease: 

The gene encoding the PacI restriction endonuclease has 
previously been cloned and sequenced (Richard D. Morgan, New 
England Biolabs, Inc., unpublished observations). It has been 
observed that clones of PacI are unstable in E. coli, presumably 
due to the lack of a PacI methylase on these clones. The 
present invention's competence in identifying restriction 
endonucleases was further demonstrated by the use of the 
following standard procedures to make sufficient quantity of 
PacI enzyme in vitro to allow its detection and identification. 

First, Pseudomonas alcaligenes genomic DNA was 
obtained from NEB 585 (New England Biolabs, Inc., Beverly, MA). 
See also U.S. Patent No. 5,098,839. 

Primers were synthesized with the following sequences: 

5'-GTTGGATCCTAATACGACTCACTATAGGAACAGACCACCATG 
ACGCAATGTCCAAGGTG-3' (SEQ ID NO: 17) 

5'-GTTGGATCCGTCGACTTGGCAMGCCCTCTTC-3' 
(SEQ ID NO:18) 

These primers were used in a set of 8 PCR reactions (100 
µl each) that contained 0.2 mM each of the four dNTPs, 0.1 µg 
genomic DNA, 0.5 µM of each above mentioned primer, and 2 
units Vent® DNA polymerase in 1X NEB ThermoPol Buffer (10 
mM KCl, 20 mM Tris-HCl (pH 8.8 at 25°C), 10 mM (NH4)2SO4, 4 
mM MgSO4, 0.1% Triton X-100). The reaction mix was heated at 
95°C for 30 seconds, 57°C for 30 seconds, 72°C for 65 seconds 
for 27 cycles. The PCR reactions were combined and a 
standard phenol/chloroform extraction was carried out to 
remove protein. The DNA was concentrated and primer dimer 
products partially removed using an Amicon Microcon-50 
device as in Example VII. 

The transcription of the PacI gene was performed using a 
rabbit reticulocyte Protein Truncation Test Kit (Boehringer 
Mannheim). The PCR product (0.4 µg (2 µl)), transcription mix 
(2.5 µl) and 5.5 µl of RNase free water were combined and 
incubated at 30°C for 45 min. Transcription mix (8 µl) 
containing m7G(5')ppp(5')G 5' capped mRNA was added to 42 µl 
of Ambion T/T Wheat Germ translation mix (11 µl RNase free 
water, 2.5 µl 1M KOAc, 3.5 µl Amino Acid Mix, 25 µl 
Translation extract) and incubated at 27°C for 1 hr. The 
transcription/translation reaction was then tested for newly 
formed PacI restriction enzyme activity. 

Substrate DNA was digested by the 
transcription/translation mix only when that mix had been 
primed with PCR product from the Pseudomonas alcaligenes 
genomic DNA. The lanes with primed transcription/translation 
product produced banding patterns identical to the lanes with 
authentic PacI, again demonstrating the efficacy of the method 
described in the instant Application. 

EXAMPLE IX 

Stable Cloning of PacI Restriction Endonuclease: 

The restriction endonuclease PacI has been previously 
characterized biochemically and shown to recognize the 
sequence TTAATTAA. Despite repeated attempts, the gene has 
not been usefully cloned due to the apparent lack of a cognate 
methylase, and the inherent lethality of the gene product. The 
gene encoding PacI was used as a test to show that it would 
be possible to: 1) establish a stable clone of a gene encoding a 
lethal protein, and 2) show that the expression of such a 
cloned gene could be electively modulated using standard 
laboratory techniques. 

Genomic DNA from Pseudomonas alcaligenes (NEB 585) 
was prepared by standard methods. 

Primers were synthesized with the following sequences: 

5'-CCTCCTCTAGAAGAAGGAGATATACCATGACGCAATGTCCAA 
GGTGCC-3' (SEQ ID NO:19) 

5'-GGAGGGATCCTCGAGCGCTTGACTGAATAGTTAGG-3' 
(SEQ ID NO:20) 

Approximately 0.5 µg of the P. alcaligenes DNA was used 
as template in a 100 µl PCR reaction containing 0.2 mM each of 
the four dNTPs, 100 pmol of each primer, 4 units of Vent® DNA 
polymerase (VDpol) in 1X NEB ThermoPol Buffer. The reaction 
mix was heated to 94°C for 2 minutes, and then subjected to 25 
cycles of PCR, incubating at 94°C for 1 minute, 58°C for 30 
seconds, and 72°C for 30 seconds. Finally the reaction was held 
at 72°C for five minutes. 10% of the reaction product was 
checked on a 1% agarose gel, and the balance stored at -20°C 
until further use. The reaction was subjected to standard 
phenol/chloroform/isoamyl alcohol, then chloroform 
extractions to partition the protein and the PacI amplicon (DNA 
product of the PCR reaction). The amplicon was precipitated 
from the aqueous fraction by supplementing it with sodium 
acetate (pH 5.2) to 0.3 M, addition of 2.5 volumes of absolute 
ethanol, and storage at -20°C overnight. The amplicon was 
recovered by centrifugation at 14,000 rpm at 4°C for 20 
minutes, at which point the supernatant was discarded. After 
allowing the DNA pellet to dry, it was redissolved in 50 µl of 10 
mM Tris-HCl, pH 7.4. 

Approximately 2 µg of the amplicon was incubated for 2 
hours at 37°C in a 50 µl restriction endonuclease reaction 
containing 1.0 mg/ml bovine serum albumin (BSA), 40 units 
each of XbaI and XhoI, in 1X NEB buffer #2. 50 µl of 10 mM 
Tris-HCl, pH 7.4 was added to the reaction to make the volume 
100 µl. The reaction was subjected to phenol/chloroform and 
ethanol precipitation as described above. The pellet was 
dissolved in 25 µl of 10 mM Tris-HCl, pH 7.4. The resulting 
DNA preparation was electrophoresed on a 1% agarose gel, the 
desired band excised, and eluted from the agarose matrix. 
Approximately 0.5 µg of pLT7K was prepared in a similar 
manner. The eluates were mixed, then subjected to 
phenol/chloroform and ethanol precipitation as described 
above. The dry DNA mixture was dissolved in 20 µl 1X NEB 
ligase buffer and incubated with 800 units of T4 DNA ligase at 
16°C overnight. 

The ligation was subjected to phenol/chloroform and 
ethanol precipitation as described above, and dissolved in 30 
µl of 10 mM Tris-HCl, pH 7.4. 10 µl of this preparation was 
added to 85 µl of electrocompetent E. coli strain DH5αF (LTI) 
on ice. Electroporation was done in a 0.1 cm cuvette chamber 
using a BioRad Genepulser (model #1652102) set at 1.88 
kvolts. The contents of the cuvette were removed into a 1.5 
ml tube containing 0.5 ml Luria broth supplemented to 20 mM 
glucose (LB-glc) that had been prewarmed to 42°C. The tube 
was placed into a 40°C shaker for approximately 45 minutes, 
at which point it was removed to a 42°C heat block. Three 
fractions of the preparation (2%, 20%, and 78%) were spread 
onto LB-glc agar plates (prewarmed to 40°C) containing 100 
µg/ml ampicillin (LB-glc-Ap). Plates were incubated at 40°C 
overnight. 

The following day, ten transformant colonies were 
randomly picked and dispersed into 5 ml of prewarmed LB-glc- 
Ap media. These cultures were incubated overnight in a 40°C 
shaker, at which point plasmid DNA was isolated by standard 
procedures. Plasmid DNAs were screened by restriction 
digest. 7 out of the 10 selected clones had the desired 
construction: 

PT7 --> PacI coding region --> <-- Pl/R- 

Putative positives were subjected to single-pass 
sequencing reactions of the 5'-end of the insert. Five of the 
seven displayed no deviation from the expected sequence, and 
a representative clone, designated pLT7-Pac.3, was selected 
for further characterization. 

pLT7-Pac.3 was transformed into E. coli strain ER2566P 
using a variation of a standard chemical method. 
Approximately 0.05 µg of plasmid DNA was incubated with 
100 µl of cells for 30 minutes on ice. The mixture was 
warmed to 42°C for two minutes, at which point 0.9 ml of 
LB-glc was added that had been prewarmed to 42°C. The tube was 
placed into a 40°C shaker for approximately 30 minutes, at 
which point it was removed to a 42°C heat block. Two 
fractions of the preparation (2% and 20%) were spread onto 
LB-glc agar plates (prewarmed to 40°C) containing 100 µg/ml 
ampicillin (LB-glc-Ap). Plates were incubated at 40°C 
overnight. The following morning, three transformant 
colonies were randomly picked and dispersed into 5 ml of 
prewarmed LB-glc-Ap media. The cultures were incubated for 
approximately 4 hours in a 40°C shaker, at which point 2.5 ml 
of each was added to 500 ml of prewarmed LB-glc-Ap media, 
and incubated in a 40°C shaker until the culture had attained 
an O.D. 600nm of approximately 0.7. IPTG was added to a final 
concentration of ~0.8 mM, the shaker temperature was 
adjusted to 30°C, and the culture incubated for an additional 4 
hours. Approximately 1 g of cells was recovered by 
centrifugation (6000 rpm, 4°C, 15 minutes) and stored at 
-70°C overnight. 

The cell pellet was suspended (on ice) in 20 ml of a 
buffer (PacI core buffer) consisting of: 20 mM KPO4, pH 6.0; 50 
mM NaCl; 10 mM β-mercaptoethanol; 0.1 mM EDTA; 5% 
glycerol. Cells were lysed by the addition of Triton X-100 to 
0.1%, lysozyme to 1 µg/ml and, after warming briefly to 
20°C, alternating sonication/cooling on ice. The preparation 
was clarified by centrifugation (10,000 rpm, 20 minutes, 
4°C), and the supernatant removed to a fresh tube on ice. 

The cleared lysate was applied to a heparin-sepharose 
column that had been previously equilibrated with PacI core 
buffer. This was followed by an 8 column-volume wash. The 
flow-through and the wash fractions were collected and 
maintained on ice, as well as a small amount of the cleared 
lysate. The column was developed with a 50 ml gradient from 
0.05 - 1.0 M NaCl. 1.0 ml fractions were collected and 
maintained on ice. 

A low level of endonuclease activity consistent with 
that of PacI was detected in fractions distributed across the 
elution gradient. This indicated that the protein had bound 
poorly to the column and suggested that the protocol employed 
here, which had been optimized for P. alcaligenes lysates, was 
not optimal for E. coli lysates. Accordingly, the crude lysate 
and column flow-through were assayed for PacI activity, 
where it was clearly evident. 

To test whether pLT7-Pac.3 would be stable and 
electively inducible in a production-scale expression system, a 
20 liter culture was grown under conditions similar to those 
outlined above. A fresh transformation of pLT7-Pac.3 into 
ER2566P was done as outlined above. A colony was randomly 
selected, dispersed into 1 liter of media, and incubated in a 
40°C shaker overnight. This was used to inoculate a 20 liter 
fermenter run. At an OD600 of ~1.0, IPTG was added to a final 
concentration of 0.3 mM, the temperature reduced to 30°C, and 
incubation continued for an additional 4 hours. 38 grams of 
cells were harvested by continuous flow centrifugation and 
stored at -70°C for 19 days. A sample of these transformed 
cells, ER2566P-pLT7-Pac.3, was deposited under the terms and 
conditions of the Budapest Treaty with the American Type 
Culture Collection on , 1998 and received ATCC 
Accession No. . 

A clarified extract was prepared and partitioned over a 
heparin-sepharose column with a 0.05 - 1.0 M NaCI gradient. 
This procedure yielded >800 units of PacI endonuclease/g of 
wet cells. 

EXAMPLE X 

Stable Cloning of NlaIII Restriction Endonuclease: 

Example IX illustrates that pLT7K enabled the 
establishment of a stable clone encoding PacI endonuclease, 
and that expression of this protein could be electively 
modulated. The octanucleotide recognition sequence for PacI 
does not occur in pLT7K. It is possible that the plasmid would 
be less stable if it were used to clone a gene encoding a 
restriction endonuclease capable of cleaving at one or more 
sites within the construct. Therefore, the reliability of pLT7K 
was subjected to a high stringency test by cloning the gene 
encoding restriction endonuclease NlaIII (R.NlaIII), absent the 
use of the NlaIII cognate methyltransferase (M.NlaIII). 

R.NlaIII has been previously characterized biochemically 
and shown to recognize the sequence CATG (U.S. Patent No. 
5,278,060). 
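
The contrast drawn here, between an eight-base site that is absent from the vector and a four-base site that is likely to occur in it, can be illustrated by counting recognition sites in a sequence. The sketch below runs on an invented toy sequence, not the pLT7K sequence, and the function name is hypothetical. Both TTAATTAA and CATG are palindromic, so scanning one strand is sufficient for these two sites.

```python
def count_sites(sequence, site):
    """Count occurrences (overlapping allowed) of a site on the given strand."""
    sequence, site = sequence.upper(), site.upper()
    width = len(site)
    return sum(1 for i in range(len(sequence) - width + 1)
               if sequence[i:i + width] == site)

# Invented toy sequence (NOT the pLT7K sequence).
toy_construct = "GGATCCATGAAACATGCCTTAAGGCATGTT"
for name, site in [("PacI", "TTAATTAA"), ("NlaIII", "CATG")]:
    print(name, site, count_sites(toy_construct, site))
```

A construct with one or more sites for the enzyme it encodes can be cut by its own product, which is what makes the NlaIII case the more stringent test.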

The NlaIII restriction-modification system has also been 
previously cloned and sequenced, and the genes encoding M.NlaIII 
and R.NlaIII (nlaIIIM and nlaIIIR, respectively) identified (U.S. 
Patent No. 5,278,060). In vivo, plasmid-borne alleles of 
nlaIIIR exhibit instability, even when M.NlaIII is expressed 
from a co-resident plasmid. In the absence of the cognate 
methylase, an nlaIIIR clone cannot be established using 
standard methods. 

Using standard methods, plasmid DNA was prepared from 
cells that produce both M.NlaIII and R.NlaIII from separate 
plasmids. 

Primers were synthesized with the following sequences: 

5'-CCTCCTCTAGAAGAAGGAGATATACCATGAAAATCACAAAAA 
CAGAACT-3' (SEQ ID NO: 21) 

5'-GGAGGGATCCTCGAGCGCTTGACTGAATAGTCATCCGTTATCTTC 
TTCATATAATTTC-3' (SEQ ID NO: 22) 

These primers were used to generate an nlaIIIR amplicon 
containing sequences suitable for expression and directional 
cloning into pLT7K. Using the protocol described above, a gene 
encoding R.NlaIII was cloned into the pLT7 vector, with 87% 
(13/15) recovery of the desired construct (designated 
pLT7-NlaIII). The clone could be established and stably maintained 
in both DH5αF'IQ and ER2566P. 

Addition of IPTG (to 1.0 mM) to 5 ml cultures of 
ER2566P-pLT7-NlaIII resulted in rapid cessation of cell 
growth, as compared to controls. One hour after IPTG 
addition, crude lysates were prepared using standard methods. 
When assayed, an endonuclease activity consistent with that 
of R.NlaIII was apparent. 

Thus, pLT7K can be used to clone, maintain, and 
electively express genes whose products are capable of 
destroying the construct itself. 

EXAMPLE XI 

MjaV, a new restriction endonuclease from M. 
jannaschii which recognizes 5'-GTAC-3' 

The open reading frame MJ1498, which comprises 
residues 9251 to 10129 of GenBank entry U67590, was 
identified as a likely methylase gene candidate, by virtue of 
its having amino acid sequences characteristic of amino 
methyltransferases: VTSPPY (SEQ ID NO:24) and VLDPFMGIGST 
(SEQ ID NO:25). The flanking ORFs, MJ1497 and MJ1499, were 
considered as possible endonuclease genes. A match in the 
database for ORF MJ1497 made this ORF seem a less likely 
candidate, but both MJ1497 and MJ1499 were PCR amplified 
from genomic M. jannaschii DNA and cloned into the T7 
expression vector pAII17 in E. coli. Neither MJ1497 nor 
MJ1499 showed any restriction activity in the pools of clones 
prepared. The MJ1498 putative methylase gene was PCR 
amplified from genomic M. jannaschii DNA using the following 
two oligonucleotides as primers: 

forward (coding strand) BamHI cloning site, (NdeI cloning 
site): 

5'-GTTGGATCCGTAATTAAGGAGGTAATTCATATGGAGATAAAT 
AAAATCTAC-3' (SEQ ID NO:26) 

reverse: SalI (EcoRI) cloning site: 
5'-GTTGAATCCGTCGACTATTTAAATAAATGCATC-3' 

(SEQ ID NO: 27) 

The PCR reaction was performed by combining: 

20 µl 10X ThermoPol Buffer (New England Biolabs, Inc.) 
16 µl dNTP solution (4 mM) 
15 µl forward primer above (10 µM) 
15 µl reverse primer above (10 µM) 
133 µl dH2O 
1.5 µl M. jannaschii genomic DNA 
4 µl Vent® exo- DNA polymerase 
1 µl Vent® DNA polymerase 

This master reaction mix was divided into 5 tubes of 40 µl 
each, to which were added 0.0, 0.4, 0.8, 1.2 and 1.6 µl of 
100 mM MgSO4 solution per tube to create reactions of 2, 3, 4, 
5 and 6 mM Mg++ concentrations. 

These five tubes were incubated at 95°C - 2 min for one cycle, 
95°C - 30 sec, 48°C - 30 sec, 72°C - 1 min for 5 cycles, 
then 95°C - 30 sec, 58°C - 30 sec, 72°C - 1 min for 25 
additional cycles. The amplified DNA was phenol/chloroform 
extracted, alcohol precipitated and resuspended in TE buffer. 
A portion of the amplified DNA was then cleaved with BamHI 
and SalI, phenol-chloroform extracted, alcohol precipitated 
and resuspended in TE. The cleaved DNA was then ligated to 
vector pSYX20 DNA previously cleaved with BamHI and SalI and 
gel purified. The ligated product was transformed into E. coli 
ER2566 cells and the transformed cells were grown overnight 
on LB plates containing 50 µg/ml kanamycin. Individual 
transformants were examined and minipreps of several clones 
containing the desired size insert were prepared. The cloned 
DNA was digested with various restriction enzymes in an 
attempt to find an enzyme which would cleave the pSYX20 
vector but be unable to cut the MJ1498 clone, thus 
demonstrating that the cloned MJ1498 ORF was functioning as 
a methyltransferase to protect the vector DNA containing the 
MJ1498 gene against cleavage by that particular restriction 
endonuclease. It was found that the clones of MJ1498 were 
not cleaved by the restriction endonuclease RsaI, indicating 
that the methylase was protecting the GTAC sequence 
recognized by RsaI against cleavage. This showed that 
MJ1498 was able to function as a methyltransferase, as 
predicted, in E. coli. The methyltransferase activity could be 
methylating at GTAC, or GTAC could be a subset of the 
methyltransferase target sequence. In looking for a cognate 
restriction activity, it was observed that the ORF once 
removed from MJ1498, MJ1500, did not significantly match 
anything in the database by BLAST search. The possibility 
that an endonuclease might be one ORF removed from its 
cognate methylase was strengthened by the observation that 
MJ598 is the methylase and MJ600 is the endonuclease in the 
MjaIII system described above. The MJ1500 ORF, which 
comprises residues 767 to 74 of GenBank sequence U67591, 
was amplified from genomic M. jannaschii DNA using the 
following two oligonucleotides as primers: 

forward (coding strand) BamHI cloning site, T7 promoter,
Kozak sequence:

5'-GTTGGATCCTAATACGACTCACTATAGGAACAGACCACCATG 
GATGATAAGAGCTACTATG-3' (SEQ ID NO:28) 

reverse: 

5'-CATTAATATATAAATAAATACATAAAT-3' (SEQ ID NO: 29) 
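The layout of the forward primer described above (cloning site, promoter, translational start context) can be checked with a short script. This is only an illustrative sketch: the feature strings below (the BamHI site GGATCC, a T7 promoter core, and a Kozak-like context around the ATG) are assumptions chosen for the check and are not taken from the specification, and the function name is hypothetical.

```python
# Forward primer as printed above (SEQ ID NO:28), joined across the line break.
FORWARD_PRIMER = (
    "GTTGGATCCTAATACGACTCACTATAGGAACAGACCACCATG"
    "GATGATAAGAGCTACTATG"
)

# Assumed feature sequences, used for illustration only.
FEATURES = {
    "BamHI site": "GGATCC",
    "T7 promoter": "TAATACGACTCACTATAGG",
    "Kozak context": "ACCATGG",
    "start codon": "ATG",
}


def locate_features(primer, features):
    """Return {feature name: 0-based index of first match, or -1 if absent}."""
    return {name: primer.find(seq) for name, seq in features.items()}


positions = locate_features(FORWARD_PRIMER, FEATURES)
for name, pos in positions.items():
    print(f"{name}: offset {pos}")
```

The offsets increase in the order cloning site, promoter, start context, which matches the order in which the primer elements are listed in the text.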
The PCR reaction was performed by combining:

20 ul 10X PCR BUFFER II (PE)
12 ul dNTP solution (4 mM)
15 ul forward primer above (10 uM)
15 ul reverse primer above (10 uM)
1.5 ul M. jannaschii genomic DNA
16 ul MgCl2 (25 mM stock) (PE)
122 ul dH2O
2 ul (10 u) AmpliTaq DNA polymerase (PE)

This master reaction mix was divided into 2 tubes of
100 ul each, to which were added 0 and 8 ul of 25 mM MgCl2
solution per tube to create reactions of 2 and 4 mM Mg++
concentrations.
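The nominal Mg++ concentrations can be checked from the listed volumes. The following is a minimal sketch, assuming that the 10X buffer itself contributes no magnesium and that volumes are additive; the function and variable names are illustrative only.

```python
STOCK_MG_MM = 25.0  # MgCl2 stock concentration from the recipe (mM)

# Master mix volumes in ul, as listed above.
MASTER_MIX_UL = {
    "10X PCR Buffer II": 20.0,
    "dNTP solution": 12.0,
    "forward primer": 15.0,
    "reverse primer": 15.0,
    "genomic DNA": 1.5,
    "MgCl2 stock": 16.0,
    "dH2O": 122.0,
    "AmpliTaq": 2.0,
}


def final_mg_mm(aliquot_ul, extra_mg_ul):
    """Mg++ concentration (mM) after adding extra MgCl2 stock to an aliquot.

    Assumes the buffer is magnesium-free and volumes are additive.
    """
    total_master_ul = sum(MASTER_MIX_UL.values())
    base_mm = MASTER_MIX_UL["MgCl2 stock"] * STOCK_MG_MM / total_master_ul
    # mM * ul = nmol, so the amounts of Mg++ can simply be added.
    mg_nmol = base_mm * aliquot_ul + STOCK_MG_MM * extra_mg_ul
    return mg_nmol / (aliquot_ul + extra_mg_ul)


for extra in (0.0, 8.0):
    print(f"{extra} ul extra MgCl2: {final_mg_mm(100.0, extra):.2f} mM Mg++")
```

Under these assumptions the two tubes come out at roughly 2.0 mM and 3.7 mM, in line with the nominal 2 and 4 mM values once the dilution from the added 8 ul is taken into account.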

These tubes were incubated at 95°C for 2 min for one
cycle, then at 95°C for 30 sec, 40°C for 30 sec and 72°C for
1 min for 5 cycles, followed by 95°C for 30 sec, 48°C for
30 sec and 72°C for 1 min for 25 additional cycles. The
amplified DNA was phenol/chloroform extracted, alcohol
precipitated and resuspended in TE buffer at a concentration
of 200 ug/ml. The
amplified MJ1500 ORF was used for in vitro 
transcription/translation reactions as described above in 
Example I. The in vitro transcription/translation product was 
found to cut DNA at the sequence GTAC, demonstrating that 
MJ1500 is the cognate endonuclease to the MJ1498 methylase, 
and that this restriction system recognizes the sequence
5'-GTAC-3'.
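When interpreting digests such as those above, it can help to predict where a recognition sequence occurs in a substrate. The following is a minimal sketch, not part of the described protocol: the helper name and the toy substrate sequences are hypothetical, and only the degenerate symbol N is handled.

```python
import re


def find_sites(seq, site):
    """Return 0-based start positions of every (possibly overlapping) match.

    Only the IUPAC symbol N (any base) is expanded; other degenerate
    symbols are not supported in this sketch.
    """
    pattern = "".join("[ACGT]" if base == "N" else base for base in site.upper())
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", seq.upper())]


# Toy substrates, invented for illustration only.
print(find_sites("AAGTACGGGTACTT", "GTAC"))
print(find_sites("CCGTTAACGG", "GTNNAC"))
```

Because GTAC is its own reverse complement, scanning one strand is sufficient for this site.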



EXAMPLE XII 



A putative new restriction endonuclease from M. 
jannaschii (ORF 1200/1199 - not yet identified): 

During the search of the M. jannaschii genome sequence, 
as outlined in Example I, several open reading frames were 
identified that appeared to encode DNA methylase genes and 
were candidates to be part of Type II restriction-modification 
systems. One of these was the open reading frame labelled 
MJ1200, which showed the closest match to the known gene 
encoding the methylase M.DdeI. From the characteristic motifs
(Posfai, et al., Nucl. Acids Res. 17:2421-2435 (1989);
Lauster, et al., J. Mol. Biol. 206:305-312 (1989)) present in this
gene it is predicted to encode a cytosine-5 DNA methylase.
However, because the variable region of this putative gene is 
not a good match for anything in the database it is possible 
that it recognizes a new DNA sequence. Immediately following 
this gene is an open reading frame that shows a good match to 
a ribosomal protein (L24E), while preceding the gene is an open 
reading frame (MJ1199) with no clear similarity to any other 
open reading frame present in GenBank. This open reading 
frame, MJ1199, is predicted to encode a new restriction 
enzyme and comprises the complementary strand residues
9158-10258 of the GenBank entry U67561.
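The reasoning in this example (a methylase-like ORF whose neighbor has no database match is a candidate endonuclease) can be expressed as a simple filter. The following is a minimal sketch under stated assumptions: the `Orf` record, the `window` parameter and the function name are hypothetical, and the annotations are limited to what the paragraph above states. The ORF following MJ1200 is not named in the text, so it is given a placeholder label.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Orf:
    name: str
    methylase_like: bool           # carries known methylase motifs
    database_match: Optional[str]  # best database hit, or None if no clear match


def candidate_endonucleases(orfs: List[Orf], window: int = 1) -> List[Tuple[str, str]]:
    """Pair each methylase-like ORF with unmatched ORFs within `window` positions."""
    candidates = []
    for i, orf in enumerate(orfs):
        if not orf.methylase_like:
            continue
        lo, hi = max(0, i - window), min(len(orfs), i + window + 1)
        for neighbor in orfs[lo:hi]:
            if neighbor is orf or neighbor.methylase_like:
                continue
            if neighbor.database_match is None:
                candidates.append((orf.name, neighbor.name))
    return candidates


# Gene order as described in the paragraph above.
region = [
    Orf("MJ1199", False, None),
    Orf("MJ1200", True, "M.DdeI"),
    Orf("ORF following MJ1200 (placeholder name)", False, "ribosomal protein L24E"),
]
print(candidate_endonucleases(region))
```

Setting `window=2` would also cover the arrangement noted in Example XI, where the endonuclease lies one ORF removed from its cognate methylase.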

To characterize the putative new restriction enzyme 
encoded by MJ1199, a detailed protocol similar to that of 
Example I will be employed. The segment of the genome of M.
jannaschii containing the open reading frame MJ1199 will be
amplified by PCR using as primers the following two 
oligonucleotides: 

5'-GTTTAATACGACTCACTATAGGGTTAGGAGGTATTACAT 
(A)TGAGAAAAATGTTTATTTGTTTGC-3' (SEQ ID NO:5) 

Note that the marked (A) is a G in the original genome; it
is changed to an A to ensure a better translational start. The
resulting ATG is the start codon of the open reading frame.
Sequences preceding the (A) are not present in the genome, but
contain the T7 RNA polymerase promoter sequence and a good
ribosome binding site;

5'-GTTGGATCCGGAGATTCCTGAGGCATCTTTG-3'
(SEQ ID NO:6) 
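Because MJ1199 is described above as lying on the complementary strand (residues 9158-10258 of U67561), its coding sequence is the reverse complement of that interval. The following is a minimal sketch of the coordinate handling, assuming 1-based inclusive positions; the helper names and the toy entry sequence are hypothetical and no GenBank data are reproduced here.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_orf(entry_seq, start, end, complementary=False):
    """Extract residues start..end (1-based, inclusive) from an entry.

    If `complementary` is True, the reverse complement is returned so the
    result reads 5' to 3' on the coding strand.
    """
    if not 1 <= start <= end <= len(entry_seq):
        raise ValueError("coordinates fall outside the entry")
    segment = entry_seq[start - 1:end]
    return reverse_complement(segment) if complementary else segment


def interval_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


# Toy entry invented for illustration: a short ORF on the complementary strand.
toy_entry = "GGCATTTTTCTCATCC"
print(extract_orf(toy_entry, 3, 14, complementary=True))

# The stated MJ1199 interval spans a whole number of codons.
print(interval_length(9158, 10258), interval_length(9158, 10258) % 3)
```

With real data the entry sequence would be read from the GenBank record instead of the toy string used here.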

The PCR-amplified segment will be subjected to in vitro 
transcription/translation as detailed in Example I and the 
product will be tested for restriction enzyme activity by 
incubating the transcription/translation mix with various 
DNAs such as those of bacteriophages λ and T7 and
Adenovirus-2. Incubations will be at various temperatures, ranging from
30°C to 90°C and for various lengths of time. After incubation 
the reactions will be examined by agarose gel electrophoresis 
to see if banding patterns, characteristic of restriction 
enzyme digestion, are present. If they are, then the new 
restriction enzyme will be characterized as to its recognition 
sequence and cleavage site in the usual way (Schildkraut, I.S., 
"Screening for and Characterizing Restriction Endonucleases", 
in Genetic Engineering, Principles and Methods, Vol. 6, pp. 117- 
140, Plenum Press (1984); Roberts, R.J. and Halford, S.E. in 
Nucleases [Eds. Linn, S.M., Lloyd, R.S., Roberts, R.J.] Cold Spring
Harbor Press, pp 35-88 (1993)). 
What is claimed is:

1. A method for identifying a restriction endonuclease 
comprising the steps of: 

(a) screening a target DNA sequence for the presence 
of known DNA methylase sequence motifs; 

(b) identifying any open reading frames which lie close 
to the methylase sequence motifs screened in step 
(a); and 

(c) assaying the protein products of the open reading 
frames of step (b) for restriction endonuclease 
activity. 

2. The method of claim 1, wherein the target DNA sequence
is selected from the group consisting of bacterial DNA
sequences, archaeal DNA sequences and viral DNA
sequences.

3. The method of claim 1, wherein the screening of step (a)
comprises searching DNA sequence databases.

4. The method of claim 1, wherein the protein products of
step (c) are produced by in vitro transcription and
translation.

5. The method of claim 4, wherein the restriction
endonuclease is a thermophilic restriction endonuclease
and the translation mix is selected from the group
consisting of a Wheat Germ translation mix, a bacterial
S30, or a rabbit reticulocyte system.

6. The method of claim 1, wherein the protein products of
step (c) are produced by recombinant DNA techniques.

7. The method of claim 1, wherein step (c) further
comprises the steps of:

(d) growing the original microorganism;

(e) preparing cell extracts; and

(f) testing the extracts of step (e) for restriction
endonuclease activity.

8. The method of claim 1, wherein the methylase sequence
motifs of step (a) are selected from the group consisting
of cytosine-5 methylase motifs, N4C-methylase motifs,
and N6A-methylase motifs.

9. A substantially pure restriction endonuclease MjaIV
obtainable from M. jannaschii, said endonuclease
recognizing the following base sequence in
double-stranded deoxyribonucleic acid molecules:

5' - GTN↓NAC - 3'
3' - CAN↑NTG - 5'

and having a cleavage position defined by the arrows.

10. Isolated DNA coding for the MjaIV restriction
endonuclease, wherein the isolated DNA is obtainable
from Methanococcus jannaschii.

11. A recombinant vector comprising a vector into which DNA
coding for MjaIV restriction endonuclease has been
inserted.

12. The recombinant vector of claim 11 wherein the DNA
comprises residues 1748-2485 of GenBank Entry U67573.

13. A host cell transformed with the recombinant vector of
claim 11 or 12.

14. A method of producing MjaIV restriction endonuclease
comprising culturing a host cell transformed with the 
vector of claim 11 or 12 under conditions suitable for 
expression of said endonuclease. 

15. Isolated DNA coding for the MjaII restriction
endonuclease, wherein the isolated DNA is obtainable
from M. jannaschii.

16. A recombinant vector comprising a vector into which DNA
coding for MjaII restriction endonuclease has been
inserted. 

17. The recombinant vector of claim 16 wherein the DNA 
comprises residues 11380-12492 of GenBank Entry 
U67585. 

18. A host cell transformed with the recombinant vector of 
claim 16 or 17. 

19. A method of producing MjaII restriction endonuclease
comprising culturing a host cell transformed with the 
vector of claim 16 or 17 under conditions suitable for 
expression of said endonuclease. 

20. A substantially pure MjaIII restriction endonuclease
obtainable from M. jannaschii, which recognizes the 
following base sequence in double-stranded 
deoxyribonucleic acid molecules: 

5' - GATC - 3' 
3' - CTAG - 5' 

and wherein said endonuclease is an isoschizomer of
MboI.

21. Isolated DNA coding for the MjaIII restriction
endonuclease, wherein the isolated DNA is obtainable
from M. jannaschii.

22. A recombinant vector comprising a vector into which DNA
coding for MjaIII restriction endonuclease has been
inserted. 

23. The recombinant vector of claim 22, wherein the DNA 
comprises residues 5632-6504 of GenBank Entry U67508. 

24. A host cell transformed with the recombinant vector of 
claim 22 or 23. 

25. A method of producing the MjaIII restriction
endonuclease comprising culturing a host cell 
transformed with the vector of claim 22 or 23 under 
conditions suitable for expression of said endonuclease. 

26. A method of identifying an isoschizomer of a known 
restriction endonuclease, said isoschizomer possessing 
a desired physical property, said method comprising the 
steps of: 

(a) identifying any open reading frames in genomic DNA 
encoding said known restriction endonuclease; 

(b) comparing said open reading frames of step (a)
against known open reading frames in at least one
organism possessing said desired physical property 
to identify potential sequence matches; and 

(c) assaying the protein products of said candidate
isoschizomer sequences of step (b) for restriction
endonuclease activity under conditions selective 
for said desired physical property. 



27. The method of claim 26, wherein said desired physical 
property is selected from the group consisting of 
thermostability, halostability, acidostability, and 
cryostability. 

28. The method of claim 26, wherein said screening of step 
(b) comprises searching DNA sequence databases. 

29. The method of claim 26, wherein said protein products of 
step (c) are produced by in vitro transcription and 
translation. 



30. The method of claim 29, wherein said isoschizomer is 
thermostable and said translation mix is selected from 
the group consisting of a Wheat Germ translation mix, a 
rabbit reticulocyte system, or a bacterial S30. 

31. The method of claim 26, wherein said protein products of
step (c) are produced by recombinant DNA techniques. 

32. A vector suitable for cloning a DNA sequence encoding a
cytotoxic protein, wherein said vector comprises at least
a first and a second transcription promoter and is
adapted to accept said DNA sequence insert and wherein
said first and said second transcription promoters are
independently controllable.

33. The vector of claim 32, wherein said first transcription
promoter enables anti-sense strand transcription and
said second transcription promoter enables sense strand
transcription.

34. The vector of claim 33, wherein said first transcription
promoter comprises λ phage promoter and said second
transcription promoter comprises T7 RNA polymerase
promoter.

35. The vector of claim 34, wherein said vector comprises
pLT7K.

36. The vector of claim 34, wherein said independent control
of said first and second transcription promoters
comprises control by a member of the group consisting of
temperature, IPTG addition, and inhibition of at least one
RNA polymerase required for transcription of said
vector.

37. The vector of claim 36 wherein said inhibition comprises 
inhibition by a member of the group consisting of 
bacteriophage T7 lysozyme expression, and utilization of 
a T7 RNA polymerase negative E. coli strain. 

38. An E. coli host cell transformed by the vector of any one
of claims 32, 33, 34, 35, 36 and 37.

39. A method for producing a recombinant cytotoxic protein, 
said method comprising the steps of: 

(1) inserting a DNA sequence encoding said cytotoxic 
protein into the vector of any one of claims 32, 33, 34, 
35, 36, and 37; 

(2) transforming a host cell with the vector of step (1) 
under conditions which disallow the expression of said 
sense strand; 

(3) culturing said transformed host cell of step (2) 
under conditions which disallow the expression of said 
sense strand; 

(4) inducing the selective expression of said sense 
strand; and 

(5) recovering said recombinant cytotoxic protein. 

40. The method of claim 39, wherein said induction of
step (4) comprises induction by a member of the group
consisting of temperature, IPTG addition, and inhibition 
of at least one RNA polymerase required for 
transcription of said vector. 

41. A stable recombinant vector encoding R.NlaIII, said
vector comprising pLT7-nlaIIIR, wherein said vector
does not encode M.NlaIII.

42. A host cell transformed by the vector of claim 41. 

43. Isolated DNA coding for the PacI restriction
endonuclease, wherein the isolated DNA is obtainable
from ATCC Accession No. 55044.

44. A recombinant DNA vector comprising a vector into
which a DNA segment coding for PacI endonuclease
produced by Pseudomonas alcaligenes has been inserted.

45. A cloning vector which comprises a vector into which
the isolated DNA of claim 43 has been inserted.

46. A host cell transformed by the vector of claim 44.

47. A method of producing PacI restriction endonuclease
comprising culturing a host cell transformed with the 
vector of claim 44 under conditions suitable for 
expression of said endonuclease. 

48. A substantially pure MjaV restriction endonuclease
obtainable from M. jannaschii, which recognizes the 
following base sequence in double-stranded 
deoxyribonucleic acid molecules: 

5'-GTAC-3' 
3'-CATG-5' 

and wherein said endonuclease is an isoschizomer of
RsaI.

49. Isolated DNA coding for the MjaV restriction
endonuclease, wherein the isolated DNA is obtainable
from M. jannaschii.

50. A recombinant vector comprising a vector into which
DNA coding for MjaV restriction endonuclease has been
inserted. 

51. The recombinant vector of claim 50, wherein the DNA
comprises residues 767-74 of GenBank Entry U67591. 

52. A host cell transformed with the recombinant vector of 
claim 50 or 51. 

53. A method of producing the MjaV restriction endonuclease
comprising culturing a host cell transformed with the 
vector of claim 50 or 51 under conditions suitable for 
expression of said endonuclease. 

LEGEND

cI857    GENE ENCODING COLIPHAGE LAMBDA TRANSCRIPTIONAL REPRESSOR PROTEIN
P L/R    COLIPHAGE LAMBDA TRANSCRIPTIONAL PROMOTER SEQUENCE
O cI     cI REPRESSOR BINDING SITE
MCS1     ONE OF TWO MULTIPLE CLONING SITES
kanR     GENE DETERMINING KANAMYCIN RESISTANCE
MCS2     ONE OF TWO MULTIPLE CLONING SITES
P T7     COLIPHAGE T7 TRANSCRIPTIONAL PROMOTER SEQUENCE
O lac    LacI REPRESSOR BINDING SITE
lacI     GENE ENCODING E. COLI TRANSCRIPTIONAL REPRESSOR PROTEIN
rop      GENE ENCODING PLASMID REPLICATIVE CONTROL PROTEIN
ori      PLASMID REPLICATIVE ORIGIN
bla      GENE DETERMINING AMPICILLIN RESISTANCE